MONTREAL NOTES ON QUADRATIC FOURIER ANALYSIS 

BEN GREEN 



Abstract. These are notes to accompany four lectures that I gave at the School on
additive combinatorics, held in Montreal, Quebec between March 30th and April 5th
2006.

My aim is to introduce "quadratic Fourier analysis" in so far as we understand it at
the present time. Specifically, we will describe "quadratic objects" of various types and
their relation to additive structures, particularly four-term arithmetic progressions.

I will focus on qualitative results, referring the reader to the literature for the many
interesting quantitative questions in this theory. Thus these lectures have a distinctly
"soft" flavour in many places.

Some of the notes cover unpublished work which is joint with Terence Tao. This
will be published more formally at some future juncture.

1. Lecture 1

Topics to be covered:

• Introduction. The finite field philosophy.
• Review of notation and basic properties of the Fourier transform.
• Counting 3- and 4-term arithmetic progressions using the Gowers $U^2$- and $U^3$-norms:
  generalised von Neumann theorems.
• Inverse theorem for the Gowers $U^2$-norm.
• The "quadratic" example for the Gowers $U^3$-norm.
• Brief revision of key results from additive combinatorics.

What is "quadratic Fourier analysis"? The aim of this series of lectures is to give
a reasonably detailed answer to that question, at least in so far as is possible at the
present time.

It would, however, be presumptuous to suppose that any reader would venture to the 
end of these notes in order to discover the meaning of the title, so we begin with a very 
brief introduction. 

Fourier analysis, or "linear" Fourier analysis as we shall call it in these notes, is a
multi-faceted subject. One rather small part of it is concerned with solving linear equations.
Two examples of theorems which may be proven using some kind of study of the Fourier
transform are

• (Chowla/van der Corput) There are infinitely many 3-term arithmetic progressions
  of primes.

• (Roth) Let $\delta > 0$ be fixed. Then if $N > N_0(\delta)$ is sufficiently large, any subset
  $A \subseteq \{1, \dots, N\}$ with size at least $\delta N$ contains three distinct elements in
  arithmetic progression.

Note that an arithmetic progression of length three is defined by a single linear equation

\[ x_1 + x_3 = 2x_2. \]

Standard Fourier analysis fails in many situations where we are interested in a pair of
linear equations. The natural example here is a progression of length four, which is
defined by the equations $x_1 + x_3 = 2x_2$, $x_2 + x_4 = 2x_3$. This is the situation where
quadratic Fourier analysis is appropriate. Thus by developing the methods that we will
talk about in these lectures, it is possible to prove

• (Green-Tao) There are infinitely many 4-term arithmetic progressions of primes.

• (Szemerédi) Let $\delta > 0$ be fixed. Then if $N > N_0(\delta)$ is sufficiently large, any
  subset $A \subseteq \{1, \dots, N\}$ with size at least $\delta N$ contains four distinct elements in
  arithmetic progression.

In fact we will not prove either of these theorems in this course, since we will be working
in a model setting. A common theme in additive combinatorics is the consideration of
finite field models. A full discussion may be found in [12], but the basic idea is as
follows. For many problems in additive combinatorics one is interested in the interval
$\{1, \dots, N\}$. However, it is convenient to work in a group, and so one often uses various
technical devices in order to place the problem at hand in $\mathbb{Z}/N\mathbb{Z}$. Once this is done, it
is often easy to formulate an analogous question inside an arbitrary finite abelian group
$G$. In most applications that we know of, this more general problem is scarcely harder
to solve than in the specific case $G = \mathbb{Z}/N\mathbb{Z}$. However, there is a family of groups,
namely the groups $\mathbb{F}_p^n$ where $p$ is a small prime, in which it can be relatively easy to
work. Techniques used to prove theorems in this setting can often be used to guide
proof techniques in $\mathbb{Z}/N\mathbb{Z}$, which provide theorems of actual number theoretic interest.

In this series of lectures we will focus almost exclusively on the group $G = \mathbb{F}_5^n$. I am
rather fond of the prime 5 since it is the smallest for which the notion of a 4-term
arithmetic progression is sensible.

We will conclude with a discussion of the general case at the end, in as much detail as
time permits. It turns out that the theory for $\mathbb{Z}/N\mathbb{Z}$ is surprisingly rich, and there are
strong connections with the ergodic theory techniques that are discussed in the lectures
of Bryna Kra in these Proceedings.

Notation. Opinion seems to be converging in additive combinatorics about what
constitutes the "standard" notation, and I will endeavour to keep to these norms. If $X$ is
any finite set and $f : X \to \mathbb{C}$ is any function then we write

\[ \mathbb{E}_{x \in X} f(x) := |X|^{-1} \sum_{x \in X} f(x). \]

This means that it is often possible to avoid worrying about normalising factors.

Unless specified otherwise, we will set $G := \mathbb{F}_5^n$ and write $N := |G| = 5^n$. Any character
on $G$ (that is, homomorphism $\gamma : G \to \mathbb{C}^{\times}$) has the form $x \mapsto \omega^{r^T x}$, where
$\omega := e^{2\pi i/5}$ and $r \in \mathbb{F}_5^n$ is a vector. We write $\widehat{G}$ for $\mathbb{F}_5^n$ when considered as the
group of characters in this way.

If $f : G \to \mathbb{C}$ is any function then we define its Fourier transform $\widehat{f} : \widehat{G} \to \mathbb{C}$ by

\[ \widehat{f}(r) := \mathbb{E}_{x \in G} f(x) \omega^{r^T x}. \]

We distinguish the trivial character corresponding to $r = 0$, which takes the value 1 for
all $x \in G$. If $f, g : G \to \mathbb{C}$ are two functions then we define the convolution
$f * g : G \to \mathbb{C}$ by

\[ (f * g)(x) := \mathbb{E}_{y \in G} f(y) g(x - y). \]

Note that when working on $G$ we always use the Haar measure which assigns weight
$|G|^{-1}$ to any $x \in G$. When working on $\widehat{G}$ we use the counting measure which assigns
weight 1 to every $r \in \widehat{G}$. These measures are dual to one another, which in practice
means that in formulae such as those in Lemma 1.1 below one can simply write
"$\mathbb{E}_{x \in G}$" and "$\sum_{r \in \widehat{G}}$" and thereafter be untroubled by normalising factors.

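When we talk of $L^p$ norms, these will always be taken with respect to the appropriate
underlying measure. Thus

\[ \|f\|_1 := \mathbb{E}_{x \in G} |f(x)|, \]

whereas

\[ \|\widehat{f}\|_2 := \Big( \sum_{r \in \widehat{G}} |\widehat{f}(r)|^2 \Big)^{1/2}. \]

I will be assuming familiarity with the basic properties of the Fourier transform, which
are all straightforward consequences of the orthogonality relations

\[ \sum_{r \in \widehat{G}} \omega^{r^T x} = \begin{cases} N & \text{if } x = 0 \\ 0 & \text{otherwise} \end{cases} \]

and

\[ \mathbb{E}_{x \in G} \omega^{r^T x} = \begin{cases} 1 & \text{if } r = 0 \\ 0 & \text{otherwise.} \end{cases} \]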



Lemma 1.1 (Basic properties of the Fourier transform). Suppose that $f, g : G \to \mathbb{C}$ are
any two functions. Then

(1) We have $\widehat{f}(0) = \mathbb{E}_{x \in G} f(x)$. For any $r$ we have $|\widehat{f}(r)| \leq \|f\|_1$.

(2) (Parseval identity) We have

\[ \mathbb{E}_{x \in G} f(x) \overline{g(x)} = \sum_{r \in \widehat{G}} \widehat{f}(r) \overline{\widehat{g}(r)}. \]

In particular $\|f\|_2 = \|\widehat{f}\|_2$.

(3) (Inversion) We have

\[ f(x) = \sum_{r \in \widehat{G}} \widehat{f}(r) \omega^{-r^T x}. \]

(4) (Convolution) We have $(f * g)^{\wedge} = \widehat{f}\,\widehat{g}$.

The last item here illustrates how we will denote the Fourier transform of expressions
$E$ for which it would be too cumbersome to write $\widehat{E}$.
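The normalisations above are easy to experiment with on a small group. The following Python sketch is not part of the original lectures: the test function `f` and all helper names are assumptions chosen for illustration. It implements the Fourier transform on $G = \mathbb{F}_5^2$ with the Haar measure on $G$ and counting measure on $\widehat{G}$, and checks items (1), (2) and (3) of Lemma 1.1 numerically.

```python
import cmath
from itertools import product

P, n = 5, 2                             # G = F_5^2, so N = 25
OMEGA = cmath.exp(2j * cmath.pi / P)    # omega = e^{2 pi i / 5}
G = list(product(range(P), repeat=n))   # elements of G (also used for the dual)
N = len(G)

def dot(r, x):
    """r^T x, reduced mod 5."""
    return sum(a * b for a, b in zip(r, x)) % P

def fourier(f):
    """hat f(r) = E_x f(x) omega^{r^T x}  (Haar measure on G)."""
    return {r: sum(f[x] * OMEGA ** dot(r, x) for x in G) / N for r in G}

def inverse(fhat):
    """f(x) = sum_r hat f(r) omega^{-r^T x}  (counting measure on the dual)."""
    return {x: sum(fhat[r] * OMEGA ** (-dot(r, x)) for r in G) for x in G}

# An arbitrary real-valued test function (hypothetical, for illustration only).
f = {x: ((3 * x[0] + x[1] * x[1]) % P) - 2 for x in G}
fhat = fourier(f)

lhs = sum(abs(f[x]) ** 2 for x in G) / N      # E_x |f(x)|^2
rhs = sum(abs(fhat[r]) ** 2 for r in G)       # sum_r |hat f(r)|^2
recovered = inverse(fhat)
inversion_error = max(abs(recovered[x] - f[x]) for x in G)
mean_error = abs(fhat[(0, 0)] - sum(f.values()) / N)
```

Because the two measures are dual, no factor of $N$ appears in either the Parseval check or the inversion check.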

Let us now start with the main business of these lectures. Let $G$ be a finite abelian
group with order $N$ which is coprime to 6, and let $f_1, \dots, f_4 : G \to [-1, 1]$ be functions.
In these notes a central role will be played by the multilinear operators $\Lambda_3$ and $\Lambda_4$,
defined by

\[ \Lambda_3(f_1, f_2, f_3) := \mathbb{E}_{x,d} f_1(x) f_2(x + d) f_3(x + 2d) \]

and

\[ \Lambda_4(f_1, f_2, f_3, f_4) := \mathbb{E}_{x,d} f_1(x) f_2(x + d) f_3(x + 2d) f_4(x + 3d). \]

Thus $\Lambda_3$ counts the number of 3-term arithmetic progressions "along the $f_i$", whilst $\Lambda_4$
counts the number of 4-term progressions.

When the functions $f_i$ are characteristic functions, the operators $\Lambda_3$ and $\Lambda_4$ may be
interpreted combinatorially.

Observation 1.2. Suppose that $f_i = 1_{A_i}$, where $A_i \subseteq G$ is a set. Then
$\Lambda_3(1_{A_1}, 1_{A_2}, 1_{A_3})$ is equal to $N^{-2}$ times the number of triples
$(a_1, a_2, a_3) \in A_1 \times A_2 \times A_3$ which are in arithmetic progression. Similarly,
$\Lambda_4(1_{A_1}, 1_{A_2}, 1_{A_3}, 1_{A_4})$ is equal to $N^{-2}$ times the number of quadruples
$(a_1, a_2, a_3, a_4) \in A_1 \times A_2 \times A_3 \times A_4$ which are in arithmetic progression.

There are certainly many situations in which one might be interested in counting the
number of 3- or 4-term progressions inside a set. To do this, we normally proceed as
follows. If $A \subseteq G$ is a set with size $\alpha N$, then write $f_A := 1_A - \alpha$. This is called the
balanced function of $A$, and it has expected value 0.

Lemma 1.3 (Balanced function decomposition). Suppose that $A_1, \dots, A_4 \subseteq G$, and
that $|A_i| = \alpha_i N$. Then we have

\[ \Lambda_3(1_{A_1}, 1_{A_2}, 1_{A_3}) = \alpha_1\alpha_2\alpha_3 + (\text{seven other terms}), \tag{1.1} \]

where each of the seven terms has the form $\Lambda_3(g_1, g_2, g_3)$ where each $g_i$ is either $f_{A_i}$ or
$\alpha_i$ and at least one is equal to $f_{A_i}$. Similarly

\[ \Lambda_4(1_{A_1}, 1_{A_2}, 1_{A_3}, 1_{A_4}) = \alpha_1\alpha_2\alpha_3\alpha_4 + (\text{fifteen other terms}), \tag{1.2} \]

where each of the fifteen terms has the form $\Lambda_4(g_1, g_2, g_3, g_4)$ where each $g_i$ is either $f_{A_i}$
or $\alpha_i$ and at least one is equal to $f_{A_i}$.
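Observation 1.2 and Lemma 1.3 can be checked by brute force on a small group. The sketch below is illustrative only: the set `A` and the helper names are assumptions, not taken from the lectures. It works in $G = \mathbb{F}_5^2$ with $A_1 = A_2 = A_3 = A$, counts triples in arithmetic progression directly, and expands each $1_A$ as $\alpha + f_A$ to recover the main term and the seven other terms.

```python
from itertools import product

P, n = 5, 2
G = list(product(range(P), repeat=n))
N = len(G)

def shift(x, d, c):
    """x + c*d in F_5^n."""
    return tuple((a + c * b) % P for a, b in zip(x, d))

def lambda3(g1, g2, g3):
    """Lambda_3(g1, g2, g3) = E_{x,d} g1(x) g2(x+d) g3(x+2d)."""
    return sum(g1[x] * g2[shift(x, d, 1)] * g3[shift(x, d, 2)]
               for x in G for d in G) / N ** 2

# A hypothetical test set A (chosen only for illustration).
A = {x for x in G if (x[0] * x[0] + x[1]) % P in (0, 1)}
alpha = len(A) / N
one_A = {x: 1.0 if x in A else 0.0 for x in G}
f_A = {x: one_A[x] - alpha for x in G}         # balanced function of A
const = {x: alpha for x in G}                  # the constant function alpha

# Observation 1.2: N^2 * Lambda_3 counts triples in arithmetic progression
# (the degenerate progressions with d = 0 are included in the count).
triples = sum(1 for x in G for d in G
              if x in A and shift(x, d, 1) in A and shift(x, d, 2) in A)

# Lemma 1.3: the choice (alpha, alpha, alpha) gives the main term alpha^3;
# the remaining seven choices of g_i in {alpha, f_A} give the other terms.
main_term = lambda3(const, const, const)
other_terms = [lambda3(g1, g2, g3)
               for g1, g2, g3 in product([const, f_A], repeat=3)
               if not (g1 is const and g2 is const and g3 is const)]
total = lambda3(one_A, one_A, one_A)
```

The decomposition is nothing more than multilinearity of $\Lambda_3$, which is why the identity holds exactly for any choice of `A`.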

Let us specialise to the case $A_1 = A_2 = A_3 = A_4 = A$ for simplicity, and write
$|A| = \alpha N$. What do we "expect" $\Lambda_3(1_A, 1_A, 1_A)$ and $\Lambda_4(1_A, 1_A, 1_A, 1_A)$ to be? It is not
hard to see that for a "random" set $A$, generated by tossing a coin which comes up
heads with probability $\alpha$ to decide whether each $x \in G$ lies in $A$, the expected value
of $\Lambda_3(1_A, 1_A, 1_A)$ is approximately $\alpha^3$, whilst the expected value of $\Lambda_4(1_A, 1_A, 1_A, 1_A)$
is approximately $\alpha^4$. Note that these quantities are exactly the "main terms" in the
expansions of Lemma 1.3. It is thus reasonable to suggest that the other seven terms
in (1.1) measure some kind of "non-uniformity" of $A$ relevant to 3-term progressions,
whilst the fifteen terms in (1.2) do the same for 4-term progressions.

(Footnote: Whilst we will talk exclusively about 3- and 4-term arithmetic progressions, the
reader should note that much of what we have to say may be adapted to more general
problems where it is of interest to count the number of solutions to a linear equation, or
to a pair of linear equations.)

Let us make a preliminary definition. 

Definition 1.4 (Uniformity along progressions). Let $A \subseteq G$ be a set with $|A| = \alpha N$,
and let $f_A := 1_A - \alpha$ be the balanced function of $A$. Let $\delta \in (0, 1)$ be a parameter. Then
we say that $A$ exhibits $\delta$-uniformity along 3-term progressions if whenever we have three
functions $g_1, g_2, g_3 : G \to [-1, 1]$, at least one of which is equal to $f_A$, then

\[ |\Lambda_3(g_1, g_2, g_3)| \leq \delta. \]

We define $\delta$-uniformity along 4-term progressions similarly.

Remark. It is not, at first sight, obvious that there are any sets which are uniform along 
progressions. 

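Lemma 1.5. Suppose that $A \subseteq G$ is a set with $|A| = \alpha N$. If $A$ is $\delta$-uniform along
3-term progressions, then

\[ |\Lambda_3(1_A, 1_A, 1_A) - \alpha^3| \leq 7\delta. \]

If $A$ is $\delta$-uniform along 4-term progressions, then

\[ |\Lambda_4(1_A, 1_A, 1_A, 1_A) - \alpha^4| \leq 15\delta. \]

Proof. Immediate consequence of Lemma 1.3. □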

The following question will be a recurring theme of these lectures: 

Question 1.6. Suppose that $A$ is not $\delta$-uniform along 3- or 4-term progressions. Can
we say something "useful" about $A$?

Of course, the notion of "useful" is a subjective one. The reader may assume, however,
that the mere failure of Definition 1.4 does not constitute "useful". We will see that if
$A$ is not uniform along 3-term progressions, then it exhibits "linear" behaviour, whilst
functions which are not uniform along 4-term progressions are somehow "quadratic".

Formulating, proving, and using statements of this type is our main goal in these notes. 

Question 1.6 may be answered very satisfactorily using Fourier analysis. The key tool
is the following simple lemma, whose proof is an amusing exercise using the basic
properties of the Fourier transform.

Lemma 1.7. Let $f_1, f_2, f_3 : G \to \mathbb{R}$ be any three functions. Then

\[ \Lambda_3(f_1, f_2, f_3) = \sum_{r} \widehat{f_1}(r) \widehat{f_2}(-2r) \widehat{f_3}(r). \] □
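As a sanity check on Lemma 1.7, one can compare the two sides numerically. The sketch below assumes $G = \mathbb{F}_5^2$ and uses three arbitrary real-valued test functions (hypothetical, chosen only for illustration); the helper names are likewise not from the lectures.

```python
import cmath
from itertools import product

P, n = 5, 2
OMEGA = cmath.exp(2j * cmath.pi / P)
G = list(product(range(P), repeat=n))
N = len(G)

def dot(r, x):
    return sum(a * b for a, b in zip(r, x)) % P

def scale(c, x):
    """c * x in F_5^n (Python's % keeps the result in 0..4 for negative c)."""
    return tuple((c * a) % P for a in x)

def add(x, y):
    return tuple((a + b) % P for a, b in zip(x, y))

def fourier(f):
    """hat f(r) = E_x f(x) omega^{r^T x}."""
    return {r: sum(f[x] * OMEGA ** dot(r, x) for x in G) / N for r in G}

def lambda3_direct(f1, f2, f3):
    """E_{x,d} f1(x) f2(x+d) f3(x+2d), computed from the definition."""
    return sum(f1[x] * f2[add(x, d)] * f3[add(x, scale(2, d))]
               for x in G for d in G) / N ** 2

def lambda3_fourier(f1, f2, f3):
    """sum_r hat f1(r) hat f2(-2r) hat f3(r), as in Lemma 1.7."""
    h1, h2, h3 = fourier(f1), fourier(f2), fourier(f3)
    return sum(h1[r] * h2[scale(-2, r)] * h3[r] for r in G)

# Hypothetical test functions.
f1 = {x: ((x[0] + 2 * x[1]) % P) / 4 for x in G}
f2 = {x: 1.0 if (x[0] * x[1]) % P == 1 else -0.5 for x in G}
f3 = {x: ((x[0] * x[0] + 3 * x[1]) % P - 2) / 2 for x in G}

direct = lambda3_direct(f1, f2, f3)
via_fourier = lambda3_fourier(f1, f2, f3)
```

The Fourier-side value is returned as a complex number whose imaginary part is zero up to rounding, since the functions are real.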

Proposition 1.8 (Inverse result for 3-term progressions, I). Suppose that $A$ is not
$\delta$-uniform along 3-term progressions. Then $\|\widehat{f_A}\|_\infty \geq \delta$, that is to say there is some
$r \in \widehat{G}$ such that $|\widehat{f_A}(r)| \geq \delta$.

Proof. Suppose that

\[ |\Lambda_3(g_1, g_2, f_A)| > \delta \]

for some functions $g_1, g_2 : G \to [-1, 1]$ (the analysis of the other two cases, when $g_1 = f_A$
or $g_2 = f_A$, is more-or-less identical). We have, by Lemma 1.7, the formula

\[ \Lambda_3(g_1, g_2, f_A) = \sum_{r} \widehat{g_1}(r) \widehat{g_2}(-2r) \widehat{f_A}(r). \]

Thus by Cauchy-Schwarz and Parseval's identity we infer that

\[ \delta < \Big| \sum_{r} \widehat{g_1}(r) \widehat{g_2}(-2r) \widehat{f_A}(r) \Big| \leq \|\widehat{f_A}\|_\infty \|\widehat{g_1}\|_2 \|\widehat{g_2}\|_2 \leq \|\widehat{f_A}\|_\infty. \] □

This is a very clean result, but the method of proof (appealing to a formula in Fourier 
analysis) has not, so far, proved amenable to generalisation. One way to generalise an 
argument is to first try and find a more longwinded, less natural looking approach and 
try and generalise that. We will describe such an approach now, though we hope that 
any reader looking back on this section later on will not consider it so unnatural. Note 
that the result is the same as Proposition 1.8, but the bound is slightly worse.

Proposition 1.9 (Inverse result for 3-term progressions, II). Suppose that $A$ is not
$\delta$-uniform along 3-term arithmetic progressions. Then $\|\widehat{f_A}\|_\infty \geq \delta^2$.

Proof. Let us first observe that

\[ \Lambda_3(g_1, g_2, f_A) = \mathbb{E}_{y_1, y_2}\, g_1(-y_1)\, g_2(\tfrac{1}{2} y_2)\, f_A(y_1 + y_2). \]

This is a simple reparametrisation. Applying the Cauchy-Schwarz inequality, we have

\[ |\Lambda_3(g_1, g_2, f_A)|^2 \leq \mathbb{E}_{y_2} |\mathbb{E}_{y_1} g_1(-y_1) f_A(y_1 + y_2)|^2
   = \mathbb{E}_{y_1, y_1', y_2} f_A(y_1 + y_2) f_A(y_1' + y_2) g_1(-y_1) g_1(-y_1'). \]

Applying Cauchy-Schwarz again, we have

\[ |\Lambda_3(g_1, g_2, f_A)|^4 \leq \mathbb{E}_{y_1, y_1'} |\mathbb{E}_{y_2} f_A(y_1 + y_2) f_A(y_1' + y_2)|^2
   = \mathbb{E}_{y_1, y_1', y_2, y_2'} f_A(y_1 + y_2) f_A(y_1' + y_2) f_A(y_1 + y_2') f_A(y_1' + y_2'). \tag{1.3} \]

This last expression is called (the fourth power of) the Gowers $U^2$-norm of $f_A$. Thus
we define

\[ \|f_A\|_{U^2}^4 := \mathbb{E}_{y_1, y_1', y_2, y_2'} f_A(y_1 + y_2) f_A(y_1' + y_2) f_A(y_1 + y_2') f_A(y_1' + y_2'). \tag{1.4} \]

It is often useful to write this in the alternative form

\[ \|f_A\|_{U^2}^4 = \mathbb{E}_{x, h_1, h_2} f_A(x) f_A(x + h_1) f_A(x + h_2) f_A(x + h_1 + h_2). \]

It is not hard to show that $\|\cdot\|_{U^2}$ is a norm using the Cauchy-Schwarz inequality several
times. We will not make much use of this fact, and refer the reader to [9] for the proof.

Note that (1.3) implies that if $|\Lambda_3(g_1, g_2, f_A)| \geq \delta$ then $\|f_A\|_{U^2} \geq \delta$.

What now? Another way to see that $\|\cdot\|_{U^2}$ is a norm is to observe that

\[ \|f\|_{U^2}^4 = \|f * f\|_2^2 = \|(f * f)^{\wedge}\|_2^2 = \|\widehat{f}\|_4^4. \]

Thus if $\|f_A\|_{U^2} \geq \delta$ then we have

\[ \delta^4 \leq \|\widehat{f_A}\|_4^4 \leq \|\widehat{f_A}\|_\infty^2 \|\widehat{f_A}\|_2^2 \leq \|\widehat{f_A}\|_\infty^2, \]

which concludes the proof in the case that $|\Lambda_3(g_1, g_2, f_A)| \geq \delta$. Again, the cases when
$f_A = g_1$ or $g_2$ can be dealt with very similarly, and are left to the reader; the
parametrisations leading to (1.3) must be modified slightly. □
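The identity $\|f\|_{U^2} = \|\widehat{f}\|_4$ used at the end of this proof, and the resulting lower bound on the largest Fourier coefficient, can be verified numerically. The following sketch assumes $G = \mathbb{F}_5^2$ and a hypothetical test function `f` with values in $[-1, 1]$; the names are illustrative and not from the lectures.

```python
import cmath
from itertools import product

P, n = 5, 2
OMEGA = cmath.exp(2j * cmath.pi / P)
G = list(product(range(P), repeat=n))
N = len(G)

def dot(r, x):
    return sum(a * b for a, b in zip(r, x)) % P

def add(*xs):
    """Sum of any number of elements of F_5^n."""
    return tuple(sum(c) % P for c in zip(*xs))

def fourier(f):
    return {r: sum(f[x] * OMEGA ** dot(r, x) for x in G) / N for r in G}

def u2_fourth_power(f):
    """E_{x,h1,h2} f(x) f(x+h1) f(x+h2) f(x+h1+h2) for real-valued f."""
    return sum(f[x] * f[add(x, h1)] * f[add(x, h2)] * f[add(x, h1, h2)]
               for x in G for h1 in G for h2 in G) / N ** 3

# Hypothetical test function with values in [-1, 1].
f = {x: ((x[0] * x[1] + x[0]) % P - 2) / 2 for x in G}
fhat = fourier(f)

u2_4 = u2_fourth_power(f)                     # ||f||_{U^2}^4
l4_4 = sum(abs(fhat[r]) ** 4 for r in G)      # ||hat f||_4^4
sup_hat = max(abs(fhat[r]) for r in G)        # ||hat f||_infty
```

The final inequality in the proof says that `sup_hat ** 2` is at least `u2_4`, which is the quantitative content of the $U^2$ inverse theorem.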

At the moment, it is hard to see what has been gained here. To prove the result, we
still had to fall back on a formula of Fourier analysis, and furthermore the bound we
obtain is worse than that in Proposition 1.8.

We may summarise the argument in Proposition 1.9 as follows, giving the two distinct
parts a name.

• (Generalised von Neumann theorem) The operator $\Lambda_3$ is controlled by the Gowers
  $U^2$-norm. Specifically for any three functions $f_1, f_2, f_3 : G \to [-1, 1]$ we have

  \[ |\Lambda_3(f_1, f_2, f_3)| \leq \inf_{i = 1, 2, 3} \|f_i\|_{U^2}. \]

• (Gowers inverse theorem) If the Gowers $U^2$-norm of a function $f : G \to [-1, 1]$
  is large, $f$ must have a large Fourier coefficient:

  \[ \|f\|_{U^2} \geq \delta \implies \|\widehat{f}\|_\infty \geq \delta^2. \]

We note that the Gowers inverse theorem is necessary and sufficient. Indeed if
$\|\widehat{f}\|_\infty \geq \delta$ then clearly $\|\widehat{f}\|_4 \geq \delta$, and so of course $\|f\|_{U^2} \geq \delta$.

This division of labour into two parts turns out to be the natural way to proceed for
$\Lambda_4$ (and higher operators). The first part of the argument (the definition of the Gowers
norm and the Generalised von Neumann theorem) goes through somewhat
straightforwardly. The second part (the Gowers inverse theorem) does not, since we do not know
of a formula analogous to $\|f\|_{U^2} = \|\widehat{f}\|_4$.

Definition 1.10 (Gowers $U^3$-norm). Let $f : G \to [-1, 1]$ be a function. Then we define

\[ \|f\|_{U^3}^8 := \mathbb{E}_{y_1, y_2, y_3, y_1', y_2', y_3'}\, f(y_1 + y_2 + y_3) f(y_1' + y_2 + y_3) f(y_1 + y_2' + y_3) f(y_1 + y_2 + y_3') \]
\[ \qquad\qquad \times f(y_1' + y_2' + y_3) f(y_1' + y_2 + y_3') f(y_1 + y_2' + y_3') f(y_1' + y_2' + y_3') \]
\[ = \mathbb{E}_{x, h_1, h_2, h_3 \in G}\, f(x) f(x + h_1) f(x + h_2) f(x + h_3) \]
\[ \qquad\qquad \times f(x + h_1 + h_2) f(x + h_1 + h_3) f(x + h_2 + h_3) f(x + h_1 + h_2 + h_3). \]

Note that this is a kind of sum of $f$ over 3-dimensional parallelepipeds. We omit the
proof that $\|f\|_{U^3}$ is actually a norm (see [9]).

Proposition 1.11 (Generalised von Neumann theorem for 4-term APs). Let
$f_1, \dots, f_4 : G \to [-1, 1]$ be any four functions. Then we have

\[ |\Lambda_4(f_1, \dots, f_4)| \leq \inf_{i = 1, \dots, 4} \|f_i\|_{U^3}. \]

In particular if $A$ is not $\delta$-uniform along four-term progressions then $\|f_A\|_{U^3} \geq \delta$.

Proof. The idea is the same as in Proposition 1.9. Here, we find a suitable
reparametrisation of $\Lambda_4(f_1, \dots, f_4)$, and then apply the Cauchy-Schwarz inequality three times. A
"suitable reparametrisation" turns out to be

\[ \Lambda_4(f_1, f_2, f_3, f_4) = \mathbb{E}_{y_1, y_2, y_3 \in G}\, f_1(-\tfrac{1}{2} y_2 - 2y_3)\, f_2(\tfrac{1}{3} y_1 - y_3)\, f_3(\tfrac{2}{3} y_1 + \tfrac{1}{2} y_2)\, f_4(y_1 + y_2 + y_3). \tag{1.5} \]

For the rest of this section let $b()$ denote any function bounded by 1. Different
occurrences of $b$ may denote different functions. The Cauchy-Schwarz inequality implies
that

\[ |\mathbb{E}_{x \in X} \mathbb{E}_{y \in Y}\, b(x) f(x, y)| \leq |\mathbb{E}_{x \in X} \mathbb{E}_{y^{(0)}, y^{(1)} \in Y}\, f(x, y^{(0)}) f(x, y^{(1)})|^{1/2}. \tag{1.6} \]

We apply this three times. At the first application we take $X := \{y_2, y_3\}$ and $Y := \{y_1\}$,
and put the function $f_1$ inside the $b()$ term. We now have variables $y_1^{(0)}, y_1^{(1)}, y_2, y_3$.
Now set $X := \{y_1^{(0)}, y_1^{(1)}, y_3\}$, $Y := \{y_2\}$ and arrange for everything involving $f_2$ to
be placed in the $b()$ term. We now have variables $y_1^{(0)}, y_1^{(1)}, y_2^{(0)}, y_2^{(1)}, y_3$. For the final
application of Cauchy-Schwarz set $X := \{y_1^{(0)}, y_1^{(1)}, y_2^{(0)}, y_2^{(1)}\}$ and $Y := \{y_3\}$, and arrange
for everything involving $f_3$ to be placed in the $b()$ term. Note that at this point we
have eliminated everything involving $f_1, f_2, f_3$ and have

\[ |\mathbb{E}_{y_1, y_2, y_3}\, f_1(-\tfrac{1}{2} y_2 - 2y_3)\, f_2(\tfrac{1}{3} y_1 - y_3)\, f_3(\tfrac{2}{3} y_1 + \tfrac{1}{2} y_2)\, f_4(y_1 + y_2 + y_3)| \]
\[ \leq |\mathbb{E}_{y_1^{(0)}, y_1^{(1)}, y_2^{(0)}, y_2^{(1)}, y_3^{(0)}, y_3^{(1)}}\, f_4(y_1^{(0)} + y_2^{(0)} + y_3^{(0)}) f_4(y_1^{(1)} + y_2^{(0)} + y_3^{(0)}) \times \cdots|^{1/8}. \]

The right-hand side here is precisely $\|f_4\|_{U^3}$.

To show that $\Lambda_4(f_1, f_2, f_3, f_4)$ is bounded by the other expressions $\|f_j\|_{U^3}$, one may
proceed similarly. We leave the details to the reader. □
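The inequality of Proposition 1.11 can be observed numerically on the smallest admissible group. The sketch below takes $G = \mathbb{F}_5$ (that is, $n = 1$) and four arbitrary functions with values in $[-1, 1]$; both the functions and the helper names are assumptions made for illustration.

```python
from itertools import product

P = 5
G = range(P)                     # G = F_5 (n = 1), so N = 5

def lambda4(f1, f2, f3, f4):
    """Lambda_4 = E_{x,d} f1(x) f2(x+d) f3(x+2d) f4(x+3d)."""
    return sum(f1[x] * f2[(x + d) % P] * f3[(x + 2 * d) % P] * f4[(x + 3 * d) % P]
               for x in G for d in G) / P ** 2

def u3_norm(f):
    """Gowers U^3-norm of a real-valued f, via the sum over parallelepipeds."""
    total = 0.0
    for x, h1, h2, h3 in product(G, repeat=4):
        term = 1.0
        for e1, e2, e3 in product((0, 1), repeat=3):
            term *= f[(x + e1 * h1 + e2 * h2 + e3 * h3) % P]
        total += term
    # The average is non-negative in exact arithmetic; clamp rounding noise.
    return max(total / P ** 4, 0.0) ** (1 / 8)

# Four hypothetical functions G -> [-1, 1], listed by their values at 0..4.
fs = [
    [1.0, -1.0, 0.5, 0.0, -0.5],
    [0.2, 0.9, -0.7, 1.0, -1.0],
    [-1.0, -1.0, 1.0, 0.3, 0.6],
    [0.8, -0.2, -0.9, 0.4, 1.0],
]
lam4 = lambda4(*fs)
u3_norms = [u3_norm(f) for f in fs]
```

Here `abs(lam4)` should not exceed the smallest entry of `u3_norms`, in line with the proposition.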

We now come to the central question of quadratic Fourier analysis: when is $\|f\|_{U^3}$ large?
The first key observation is that the answer is not simply the same as for the $U^2$-norm.

Lemma 1.12 (Key example). There is a function $f : G \to \mathbb{C}$ with $\|f\|_\infty \leq 1$ such that
$\|f\|_{U^3} = 1$, but such that $\|\widehat{f}\|_\infty \leq N^{-1/2}$.

Proof. Before embarking on the proof, we must remark that $\|\cdot\|_{U^3}$ has only been defined
for real-valued functions thus far. To define it for complex-valued functions, one must
take complex conjugates of the terms $f(x + h_1)$, $f(x + h_2)$, $f(x + h_3)$ and $f(x + h_1 + h_2 + h_3)$.
The extension to complex-valued functions facilitates the discussion of examples, but is
not otherwise essential in the theory. Keeping track of complex conjugates is rather a
tedious affair, so we will endeavour to work with real functions whenever possible.

Set $f(x) = \omega^{x^T x}$. We have

\[ \|f\|_{U^3}^8 = \mathbb{E}_{x, h_1, h_2, h_3}\, \omega^{x^T x - (x + h_1)^T (x + h_1) - \cdots - (x + h_1 + h_2 + h_3)^T (x + h_1 + h_2 + h_3)} = 1. \]

This can be seen by intelligent direct computation (or even by naive direct computation);
the phase vanishes since it is essentially the third derivative of a quadratic.

To evaluate $\|\widehat{f}\|_\infty$, observe that we have

\[ \widehat{f}(r) = \mathbb{E}_{x \in G}\, \omega^{x^T x + r^T x} = \prod_{i=1}^{n} \mathbb{E}_{x_i \in \mathbb{F}_5}\, \omega^{x_i^2 + r_i x_i}. \]

Each factor is a Gauss sum of modulus $5^{-1/2}$, so that $|\widehat{f}(r)| = N^{-1/2}$ for every $r$.
This concludes the proof of the lemma. □
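Both claims of Lemma 1.12 can be confirmed by the "naive direct computation" mentioned in the proof. The sketch below is an illustration under stated assumptions: it computes the complex $U^3$-norm of $f(x) = \omega^{x^T x}$ for $n = 1$, conjugating the four terms named in the proof, and the largest Fourier coefficient for $n = 1$ and $n = 2$. The function names are hypothetical.

```python
import cmath
from itertools import product

P = 5
OMEGA = cmath.exp(2j * cmath.pi / P)

def quadratic_phase(n):
    """Return G = F_5^n and the function f(x) = omega^{x^T x} on it."""
    G = list(product(range(P), repeat=n))
    return G, {x: OMEGA ** (sum(a * a for a in x) % P) for x in G}

def u3_eighth_power(G, f):
    """Complex U^3 norm to the 8th power: conjugate f at the vertices
    x+h1, x+h2, x+h3 and x+h1+h2+h3 (an odd number of h's)."""
    total = 0
    for x, h1, h2, h3 in product(G, repeat=4):
        term = 1
        for e in product((0, 1), repeat=3):
            v = tuple((a + e[0] * b + e[1] * c + e[2] * d) % P
                      for a, b, c, d in zip(x, h1, h2, h3))
            term *= f[v].conjugate() if sum(e) % 2 else f[v]
        total += term
    return total / len(G) ** 4

def fourier_sup(G, f):
    """max_r |E_x f(x) omega^{r^T x}|."""
    N = len(G)
    return max(
        abs(sum(f[x] * OMEGA ** (sum(a * b for a, b in zip(r, x)) % P)
                for x in G) / N)
        for r in G)

G1, f1 = quadratic_phase(1)
G2, f2 = quadratic_phase(2)
u3_8 = u3_eighth_power(G1, f1)      # expected to equal 1
sup1 = fourier_sup(G1, f1)          # expected to equal 5^{-1/2}
sup2 = fourier_sup(G2, f2)          # expected to equal 25^{-1/2}
```

Every term in the $U^3$ average is exactly 1 here, because the alternating sum of the quadratic over the vertices of a parallelepiped vanishes.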

We conclude this first lecture by stating three key results in additive combinatorics
which we will need in the second lecture. These results will all be discussed and proved
in other lectures in this school. In these results, $0 < c < 1 < C$ are absolute constants.

Proposition 1.13 (The Balog-Szemerédi-Gowers theorem). Let $G$ be an abelian group,
and suppose that $A \subseteq G$ is a set with $|A| = n$. Suppose that there are at least $\delta n^3$
additive quadruples in $A$, that is to say solutions to $a_1 + a_2 = a_3 + a_4$. Then there is a
subset $A' \subseteq A$ with $|A'| \geq c\delta^C |A|$ such that $|A' + A'| \leq C\delta^{-C} |A'|$.

This result will be the subject of Antal Balog's lecture at the school.

Proposition 1.14 (Freiman's theorem in finite fields). Let $p$ be a prime, and write $\mathbb{F}_p^n$
for the $n$-dimensional vector space over the finite field with $p$ elements. Suppose that
$A \subseteq \mathbb{F}_p^n$ is a set with $|A + A| \leq K|A|$. Then there is a subspace $H \leq \mathbb{F}_p^n$ such that
$A \subseteq H$ and for which we have the bound $|H| \leq p^{CK^C} |A|$.

This result will be discussed by Imre Ruzsa.

Exercises. For the reader wishing to familiarise herself with the Gowers norms, we offer a handful of exercises.
Discussions pertinent to these exercises may be found in the papers [9, 17, 18].

1. Let $k \geq 2$ be any integer, and define the Gowers $U^k$-norm by

\[ \|f\|_{U^k}^{2^k} := \mathbb{E}_{x, h_1, \dots, h_k \in G} \prod_{\omega \in \{0,1\}^k} f(x + \omega \cdot h). \tag{1.7} \]

Show that $\|\cdot\|_{U^k}$ is a norm. (Hint: first define the Gowers inner product $\langle f_\omega \rangle_{\omega \in \{0,1\}^k}$ for $2^k$ functions
$(f_\omega)_{\omega \in \{0,1\}^k}$ by modifying (1.7). Then use several applications of the Cauchy-Schwarz inequality to prove the
Gowers-Cauchy-Schwarz inequality

\[ |\langle f_\omega \rangle_{\omega \in \{0,1\}^k}| \leq \prod_{\omega} \|f_\omega\|_{U^k}. \]

Finally, use this to prove the triangle inequality for $\|f\|_{U^k}$.)

2. Prove that the Gowers $U^k$-norms are nested:

\[ \|f\|_{U^2} \leq \|f\|_{U^3} \leq \cdots. \]

3. By generalising Lemma 1.12, show that the Gowers norms are strictly nested in the following strong sense.
For any $k \geq 3$ there is $c_k > 0$ such that the following is true. For any $N$, there is a group $G$ with $|G| \geq N$ and
a function $f : G \to \mathbb{C}$ with $\|f\|_\infty = \|f\|_{U^k} = 1$ such that $\|f\|_{U^{k-1}} \leq N^{-c_k}$.

4. We noted that the $U^2$ inverse theorem is an if and only if statement. That is, if $f$ is a bounded function with
$|\mathbb{E}_x f(x) \omega^{r^T x}| \geq \delta$ for some $r$ then $f$ has large $U^2$-norm. Prove this without using the fact that $\|f\|_{U^2} = \|\widehat{f}\|_4$.
(Hint: use the Gowers-Cauchy-Schwarz inequality of Ex. 1.)

5. Let $G = \mathbb{F}_5^n$. Suppose that

\[ |\mathbb{E}_{x \in G} f(x) \omega^{x^T M x + r^T x}| \geq \delta \]

for some matrix $M$ and vector $r$. Prove that $\|f\|_{U^3} \geq \delta$. (Hint: apply the Gowers-Cauchy-Schwarz inequality
again. You will need the generalisation of $\|\cdot\|_{U^3}$ which covers complex-valued functions; this can be obtained
by inserting appropriate complex conjugate symbols, as was discussed during the proof of Lemma 1.12.)

6. (Generalising the generalised von Neumann theorem) Show that

\[ |\Lambda_k(f_1, \dots, f_k)| \leq \inf_{i = 1, \dots, k} \|f_i\|_{U^{k-1}}. \]

Further reading. This material was originally laid out in Gowers [5], though the notation was slightly different
and (of course) the Gowers norms were not named as such! Various expositions of the material may be found
in papers by one or both of Terry Tao and myself. See, for example, [17, 18].

A very general version of the generalised von Neumann theorem (linking systems of $s$ equations in $t$ unknowns
to the $U^{s+1}$ norm) may be found in our forthcoming paper [?], and an even more general version (applying to
functions which are not necessarily bounded by 1) may be found in [21].

Analogues of much of the material in this lecture were discovered in ergodic theory about 20 years ago. For
more on this fascinating connection, the lectures of Kra in these Proceedings are illuminating.

The Balog-Szemerédi-Gowers theorem was originally proved by Gowers [8], and is a quantitative version of the
earlier result of Balog and Szemerédi [1] (see also Balog's article in these Proceedings). A version with a good
value of the exponent $C$ may be found in [5]. This material is also covered in my notes [14]. The
Plünnecke-Ruzsa inequality was obtained in [25] and afforded an elegant proof by Ruzsa in [26]. The original reference
for Proposition 1.14 is the paper [25] by Imre Ruzsa. For self-contained notes on Plünnecke's inequality and
Freiman's theorem, see [15]. For a discussion of all of the material in this lecture (and indeed much of the
material in the other lectures) see the book [34].

2. Lecture 2

Topics to be covered:

• The inverse theorem for the $U^3$-norm on $\mathbb{F}_5^n$.

Some notation. Let $E, E'$ be real-valued expressions. We will write $E \gg_\delta E'$ to mean
that there is some function $c(\delta) > 0$ such that $E \geq c(\delta) E'$. There is nothing particularly
unusual about this notation, but one aspect of the manner in which we shall apply it is
somewhat subtle. When we write, for example, "let $N \gg_\delta 1$", we mean "let $N \geq c(\delta)$,
where $c : \mathbb{R}_+ \to \mathbb{R}_+$ is some function which may be chosen so that later arguments
work". We do not (of course) mean that an arbitrary function $c$ may be chosen.

We will also, on occasion, use the notation $O_\delta(1)$ to denote a finite quantity which
depends only on $\delta$.

We have deliberately chosen topics within the subject of quadratic Fourier analysis for
which bounds are unimportant, since these are the topics most allied to the "infinitary"
ideas which feature in the lectures of Kra and Tao in these Proceedings. It is quite
reasonable to think of there being just two types of quantity in these lectures: finite
quantities which depend only on $\delta$, and infinite quantities which depend on the size of
$n$.

Let us recall the main question we are trying to address. 

Question 2.1 (Gowers inverse question). Suppose that $f : G \to [-1, 1]$ is a function
and that $\|f\|_{U^3} \geq \delta$. What can we say about $f$?

It turns out to be much easier to address this question in a finite field setting such as
$G = \mathbb{F}_5^n$. We showed in the exercises to Lecture 1 that if $f$ correlates with a quadratic
phase $\omega^{x^T M x + r^T x}$ then $f$ has large $U^3$ norm. It turns out that the converse is also true,
though this is much harder to prove and will be our main goal in this lecture.

Proposition 2.2 (Inverse theorem for the $U^3$-norm on $\mathbb{F}_5^n$). Suppose that
$f : G \to [-1, 1]$ is a function for which $\|f\|_{U^3} \geq \delta$. Then there is a matrix
$M \in \mathfrak{M}_n(\mathbb{F}_5)$ and a vector $r \in \mathbb{F}_5^n$ so that

\[ |\mathbb{E}_{x \in G} f(x) \omega^{x^T M x + r^T x}| \gg_\delta 1. \]

Remark. Write $E := \sup_{M, r} |\mathbb{E}_{x \in G} f(x) \omega^{x^T M x + r^T x}|$. It is not hard to check that the proof
we give would allow one to replace $E \gg_\delta 1$ by some bound of the form $E \geq \exp(-C\delta^{-C})$.
For our later application, we will merely need some lower bound of the form $E \gg_\delta 1$.
There are other applications where bounds are important - see the further reading at
the end of this lecture for a discussion.

To prove Proposition 2.2 we will essentially follow the approach of Gowers [8]. We will,
however, employ a slight twist which is essentially due to Samorodnitsky [29].

Definition 2.3 (Derivatives). Suppose that $f : G \to \mathbb{R}$ is a function. Then for any
$h \in G$ we define the function $\Delta(f; h)$ by

\[ \Delta(f; h)(x) := f(x) f(x - h). \]

Remark. It is convenient, though perhaps slightly mystifying, to give the name
"derivative" to this construction. If we extended the definition to complex-valued functions
by setting $\Delta(f; h)(x) = f(x) \overline{f(x - h)}$ and applied it with $f(x) = e^{2\pi i \phi(x)}$, the mystery
might be reduced somewhat as the phase is indeed being differentiated.

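Proposition 2.4 (Samorodnitsky's identity). Let $f : G \to \mathbb{R}$ be any function. Then
we have

\[ \sum_{r_1 + r_2 = r_3 + r_4} \mathbb{E}_{h_1 + h_2 = h_3 + h_4} |\Delta(f; h_1)^{\wedge}(r_1)|^2 \cdots |\Delta(f; h_4)^{\wedge}(r_4)|^2 = \mathbb{E}_h \|\Delta(f; h)^{\wedge}\|_8^8. \tag{2.1} \]

Proof. The idea of the proof is simple: we show that both sides are equal to

\[ \mathbb{E}_{(c_1, \dots, c_8, c_1', \dots, c_8') \in C}\, f(c_1) \cdots f(c_8) f(c_1') \cdots f(c_8'), \tag{2.2} \]

where the average is over all configurations $C$ with

\[ c_1 + \cdots + c_4 = c_5 + \cdots + c_8 \]

and

\[ c_1 - c_1' = \cdots = c_8 - c_8'. \]

To show that the RHS of (2.1) is equal to (2.2) is the easier of the two tasks to
accomplish. One notes that

\[ \|\Delta(f; h)^{\wedge}\|_8^8 = \mathbb{E}_x |\Delta(f; h) * \Delta(f; h) * \Delta(f; h) * \Delta(f; h)(x)|^2 \]

by Parseval's identity and the fact that $(f * g)^{\wedge} = \widehat{f}\,\widehat{g}$. That the expectation of this over
$h$ is equal to (2.2) follows by expansion.

To prove that the LHS of (2.1) is equal to (2.2), it is convenient to introduce some
notation. If $\psi : \widehat{G} \to \mathbb{C}$ is a function then we define $\psi^{\vee} : G \to \mathbb{C}$ by

\[ \psi^{\vee}(x) := \sum_{r \in \widehat{G}} \psi(r) \omega^{-r^T x}. \]

Note that the inversion formula is equivalent to

\[ (\widehat{f})^{\vee} = f. \tag{2.3} \]

If $\psi, \phi : \widehat{G} \to \mathbb{C}$ are two functions then we define

\[ \psi * \phi(r) := \sum_{s} \psi(s) \phi(r - s) \]

and note the formula

\[ (\psi * \phi)^{\vee} = \psi^{\vee} \phi^{\vee}. \]

It follows from these facts and Parseval's identity that for any four functions
$g_1, \dots, g_4 : G \to \mathbb{C}$ we have

\[ \sum_{r_1 + r_2 = r_3 + r_4} \widehat{g_1}(r_1) \widehat{g_2}(r_2) \overline{\widehat{g_3}(r_3) \widehat{g_4}(r_4)} = \sum_{r} \widehat{g_1} * \widehat{g_2}(r)\, \overline{\widehat{g_3} * \widehat{g_4}(r)} = \mathbb{E}_x\, g_1(x) g_2(x) \overline{g_3(x) g_4(x)}. \tag{2.4} \]

We apply this with

\[ g_i = \Delta(f; h_i) * \Delta(f; h_i)^{\circ}, \]

where we have defined $f^{\circ}(x) := f(-x)$. Noting that $(f^{\circ})^{\wedge} = \overline{\widehat{f}}$, we see that

\[ \widehat{g_i}(r) = |\Delta(f; h_i)^{\wedge}(r)|^2. \]

Substituting into (2.4), we see that the LHS of (2.1) is equal to

\[ \mathbb{E}_{h_1 + h_2 = h_3 + h_4} \mathbb{E}_x \prod_{i=1}^{4} \Delta(f; h_i) * \Delta(f; h_i)^{\circ}(x). \]

Expanding out, we recover (2.2) once more. □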
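Samorodnitsky's identity (2.1) is exact, so it can be tested by brute force on a tiny group. The following sketch assumes $G = \mathbb{F}_5$ (so $n = 1$) and an arbitrary real-valued `f`; these choices and the helper names are illustrative assumptions rather than anything taken from the lectures.

```python
import cmath
from itertools import product

P = 5
OMEGA = cmath.exp(2j * cmath.pi / P)
G = range(P)                                  # G = F_5

f = [1.0, -0.5, 0.25, -1.0, 0.75]             # arbitrary real-valued f (hypothetical)

def derivative(f, h):
    """Delta(f; h)(x) = f(x) f(x - h)."""
    return [f[x] * f[(x - h) % P] for x in G]

def fourier(g):
    """hat g(r) = E_x g(x) omega^{r x}."""
    return [sum(g[x] * OMEGA ** ((r * x) % P) for x in G) / P for r in G]

# D[h][r] = |Delta(f; h)^(r)|^2
D = [[abs(c) ** 2 for c in fourier(derivative(f, h))] for h in G]

# Left-hand side of (2.1): sum over r1 + r2 = r3 + r4 of the average over
# h1 + h2 = h3 + h4 (r4 and h4 are determined by the other three variables).
lhs = 0.0
for r1, r2, r3 in product(G, repeat=3):
    r4 = (r1 + r2 - r3) % P
    for h1, h2, h3 in product(G, repeat=3):
        h4 = (h1 + h2 - h3) % P
        lhs += D[h1][r1] * D[h2][r2] * D[h3][r3] * D[h4][r4]
lhs /= P ** 3                                 # there are P^3 quadruples (h1,...,h4)

# Right-hand side of (2.1): E_h sum_r |Delta(f; h)^(r)|^8.
rhs = sum(D[h][r] ** 4 for h in G for r in G) / P
```

Note the asymmetry of the normalisations: the $h_i$ are averaged (Haar measure on $G$) while the $r_i$ are summed (counting measure on $\widehat{G}$).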

Using this identity, we can prove the following crucial result, which provides the first
link between functions $f$ with large $U^3$-norm and quadratic phases. It states that the
derivatives $\Delta(f; h)$ obey a sort of weak linearity property.

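Proposition 2.5 (Gowers). Let $f : G \to [-1, 1]$ be a function, and suppose that
$\|f\|_{U^3} \geq \delta$. Suppose that $|G| \gg_\delta 1$. Then there is a function $\phi : G \to \widehat{G}$ such that

(1) $|\Delta(f; h)^{\wedge}(\phi(h))| \gg_\delta 1$ for all $h \in S$, where $|S| \gg_\delta |G|$;

(2) There are $\gg_\delta |G|^3$ quadruples $(s_1, s_2, s_3, s_4) \in S^4$ such that $s_1 + s_2 = s_3 + s_4$ and
$\phi(s_1) + \phi(s_2) = \phi(s_3) + \phi(s_4)$.

Proof. Set $N := |G|$. One may easily check that

\[ \|f\|_{U^3}^8 = \mathbb{E}_h \|\Delta(f; h)\|_{U^2}^4. \]

Recalling that the $U^2$-norm is the $L^4$ norm of the Fourier transform, we thus have

\[ \|f\|_{U^3}^8 = \mathbb{E}_h \|\Delta(f; h)^{\wedge}\|_4^4. \]

Now Hölder's inequality and Parseval's identity imply that for any $h$ we have

\[ \|\Delta(f; h)^{\wedge}\|_4^4 \leq \|\Delta(f; h)^{\wedge}\|_8^{8/3} \|\Delta(f; h)^{\wedge}\|_2^{4/3} \leq \|\Delta(f; h)^{\wedge}\|_8^{8/3}. \]

Another application of Hölder yields

\[ \mathbb{E}_h \|\Delta(f; h)^{\wedge}\|_8^{8/3} \leq \big( \mathbb{E}_h \|\Delta(f; h)^{\wedge}\|_8^8 \big)^{1/3}. \]

Combining these observations, we conclude that

\[ \mathbb{E}_h \|\Delta(f; h)^{\wedge}\|_8^8 \geq \delta^{24}. \]

Samorodnitsky's identity then allows us to conclude that

\[ \sum_{r_1 + r_2 = r_3 + r_4} \mathbb{E}_{h_1 + h_2 = h_3 + h_4} |\Delta(f; h_1)^{\wedge}(r_1)|^2 \cdots |\Delta(f; h_4)^{\wedge}(r_4)|^2 \geq \delta^{24}. \tag{2.5} \]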

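To each $h \in G$, we associate the set $\Phi(h)$ of characters $r$ for which $|\Delta(f; h)^{\wedge}(r)| \geq \delta^{13}$. It
is immediate from Parseval's identity that $|\Phi(h)| \leq \delta^{-26}$ for all $h$. Now the contribution
to (2.5) from those $h_i, r_i$ for which $r_1 \notin \Phi(h_1)$ (say) is bounded by

\[ \delta^{26} \sum_{r_2, r_3, r_4} \mathbb{E}_{h_1 + h_2 = h_3 + h_4} |\Delta(f; h_2)^{\wedge}(r_2)|^2 |\Delta(f; h_3)^{\wedge}(r_3)|^2 |\Delta(f; h_4)^{\wedge}(r_4)|^2 \leq \delta^{26}. \]

It follows that

\[ \sum_{r_1 + r_2 = r_3 + r_4} \mathbb{E}_{h_1 + h_2 = h_3 + h_4} \prod_{i=1}^{4} 1_{r_i \in \Phi(h_i)} |\Delta(f; h_i)^{\wedge}(r_i)|^2 \geq \delta^{24}/2, \]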

and so in particular there are at least $\delta^{24} N^3/2$ additive octuples $(h_1, r_1, \dots, h_4, r_4)$ such
that $h_1 + h_2 = h_3 + h_4$, $r_1 + r_2 = r_3 + r_4$ and $r_i \in \Phi(h_i)$ for $i = 1, \dots, 4$. We say that
an octuple is proper if $h_1, \dots, h_4$ are all distinct. The number of our additive octuples
which fail to be proper is clearly $\ll_\delta N^2$ and hence, since $N$ is so large, at least $\delta^{24} N^3/4$
of them are proper.

Let $S$ be the set of all $h$ for which $\Phi(h) \neq \emptyset$. It is easy to see that $|S| \gg_\delta |G|$, since
otherwise there could not be enough additive octuples. For each $h \in S$, pick an element
$\phi(h)$ uniformly at random from $\Phi(h)$, and suppose that these choices are independent
for different $h$. For each proper additive octuple $(h_1, r_1, \dots, h_4, r_4)$, the probability that
it fits $\phi$, that is to say that $r_i = \phi(h_i)$ for $i = 1, 2, 3, 4$, is precisely $1/(|\Phi(h_1)| \cdots |\Phi(h_4)|)$.
This is $\gg_\delta 1$. It follows that the expected number of additive octuples which fit $\phi$ is
$\gg_\delta |G|^3$. In particular there is some specific choice of $\phi$ for which $\gg_\delta |G|^3$ additive
octuples fit $\phi$.

It takes a few seconds to realise that we have, in fact, proved the result. Indeed, an
octuple which fits $\phi$ is precisely an additive quadruple of points $h_1, \dots, h_4$ such that
$\phi(h_1) + \phi(h_2) = \phi(h_3) + \phi(h_4)$ and $\phi(h_i) \in \Phi(h_i)$, which by the definition of $\Phi(h_i)$ gives
$|\Delta(f; h_i)^{\wedge}(\phi(h_i))| \gg_\delta 1$. □
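The first steps of this proof, namely the identity $\|f\|_{U^3}^8 = \mathbb{E}_h \|\Delta(f; h)^{\wedge}\|_4^4$ and the two applications of Hölder's inequality, can be checked numerically. The sketch below again assumes $G = \mathbb{F}_5$ and a hypothetical `f` with values in $[-1, 1]$; it is an illustration of those steps only, not of the probabilistic selection of $\phi$.

```python
import cmath
from itertools import product

P = 5
OMEGA = cmath.exp(2j * cmath.pi / P)
G = range(P)

f = [1.0, -0.5, 0.25, -1.0, 0.75]             # hypothetical f : F_5 -> [-1, 1]

def derivative(f, h):
    """Delta(f; h)(x) = f(x) f(x - h)."""
    return [f[x] * f[(x - h) % P] for x in G]

def fourier(g):
    """hat g(r) = E_x g(x) omega^{r x}."""
    return [sum(g[x] * OMEGA ** ((r * x) % P) for x in G) / P for r in G]

def u3_eighth_power(f):
    """||f||_{U^3}^8 as an average over 3-dimensional parallelepipeds."""
    total = 0.0
    for x, h1, h2, h3 in product(G, repeat=4):
        term = 1.0
        for e1, e2, e3 in product((0, 1), repeat=3):
            term *= f[(x + e1 * h1 + e2 * h2 + e3 * h3) % P]
        total += term
    return total / P ** 4

hats = [fourier(derivative(f, h)) for h in G]
l4 = sum(abs(c) ** 4 for row in hats for c in row) / P    # E_h ||.||_4^4
l8 = sum(abs(c) ** 8 for row in hats for c in row) / P    # E_h ||.||_8^8
u3_8 = u3_eighth_power(f)
```

Combining the two Hölder steps gives `l4 <= l8 ** (1/3)`, which is how the lower bound $\delta^{24}$ for the eighth moments arises from $\|f\|_{U^3} \geq \delta$.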

We have made a crucial step: assuming that $\|f\|_{U^3}$ was large, we deduced that the
derivative of $f$ has a certain weak linearity property. We must now work with this
property and make it somewhat stronger.

Proposition 2.6 (From weak linearity to linearity). Suppose that $\phi : G \to \widehat{G}$ is a
function with the property in Proposition 2.5 (2), that is to say there is some set $S \subseteq G$
with $|S| \gg_\delta |G|$ such that there are $\gg_\delta |G|^3$ additive quadruples $(s_1, s_2, s_3, s_4) \in S^4$ such that
$s_1 + s_2 = s_3 + s_4$ and $\phi(s_1) + \phi(s_2) = \phi(s_3) + \phi(s_4)$. Then there is some linear function
$\psi(x) = Mx + b$, where $M \in \mathfrak{M}_n(\mathbb{F}_5)$ and $b \in \mathbb{F}_5^n$, such that $\phi(x) = \psi(x)$ for $\gg_\delta |G|$
values of $x \in S$.

Proof. The first step is to observe that the conclusion of Proposition 2.5 may be
rephrased using the graph

\[ \Gamma := \{(h, \phi(h)) : h \in S\}, \]

which is a subset of $G \times \widehat{G}$. Statement (2) of Proposition 2.5 is just the same as saying
that $\Gamma$ has $\gg_\delta |G|^3$ additive quadruples. It follows from the Balog-Szemerédi-Gowers
theorem that there is a subset $\Gamma' \subseteq \Gamma$ with

\[ |\Gamma'| \gg_\delta |\Gamma| \gg_\delta |G| \]

and

\[ |\Gamma' + \Gamma'| \ll_\delta |\Gamma'|. \]

Define $S' \subseteq S$ by

\[ \Gamma' := \{(h, \phi(h)) : h \in S'\}, \]

and note that

\[ |S'| \gg_\delta |G|. \]

Now we may identify $G \times \widehat{G}$ with $\mathbb{F}_5^n \times \mathbb{F}_5^n$ and hence with $\mathbb{F}_5^{2n}$. From Ruzsa's finite field
analogue of Freiman's theorem, it follows that there is some subspace $H \leq \mathbb{F}_5^n \times \mathbb{F}_5^n$,

\[ |H| \ll_\delta |G|, \tag{2.6} \]

such that $\Gamma' \subseteq H$.

Consider the projection map $\pi : H \to G$ onto the first factor. The image of this linear map
contains $S'$, and so from (2.6) and the lower bound for $|S'|$ we see that

\[ \dim_{\mathbb{F}_5} \ker \pi \ll_\delta 1. \]

It follows that we may foliate $H$ into $\ll_\delta 1$ cosets of some subspace $H'$, such that $\pi$ is
injective on each of these cosets. By averaging, we see that there is some $x$ such that

\[ |(x + H') \cap \Gamma'| \gg_\delta |G|. \]

Set $\Gamma'' := (x + H') \cap \Gamma'$, and define $S'' \subseteq S'$ accordingly. Then $\pi|_{x + H'}$ is an affine
isomorphism onto its image $V$, which means that there is an affine linear map $\psi : V \to \widehat{G}$
such that $(s'', \psi(s'')) \in \Gamma''$ for all $s'' \in S''$, that is to say $\psi(s'') = \phi(s'')$ for all $s'' \in S''$. □

Let us put this last result together with Proposition 2.5.

Corollary 2.7 (Linearity of the derivative). Suppose that $f : G \to [-1, 1]$ is a function
with $\|f\|_{U^3} \geq \delta$. Suppose that $|G| \gg_\delta 1$. Then there is some $M \in \mathfrak{M}_n(\mathbb{F}_5)$ and some
$b \in \mathbb{F}_5^n$ such that

\[ \mathbb{E}_h |\Delta(f;h)^{\wedge}(Mh + b)|^2 \gg_\delta 1. \]

Proof. Recall that $\phi$ is defined for $h \in S$, where

\[ |S| \gg_\delta |G|, \]

and that it has the property that

\[ |\Delta(f;h)^{\wedge}(\phi(h))| \gg_\delta 1 \]

for all $h \in S$. We proved in Proposition 2.6 that there is an affine linear function
$\psi(h) = Mh + b$ such that $\phi(h) = \psi(h)$ for all $h \in S''$, where $S'' \subseteq S$ and $|S''| \gg_\delta |G|$. The corollary
follows immediately. □

Corollary 2.7 shows that the derivative of a function $f$ with large $U^3$ norm correlates
with a linear function. Recall that our aim is to show that $f$ correlates with a quadratic
function $x \mapsto \omega^{x^T M x + r^T x}$. The latter function does have a linear derivative, but this
derivative is symmetric. For that reason we need the following lemma, which states that
the matrix $M$ in Corollary 2.7 is automatically nearly symmetric.

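Lemma 2.8 (Symmetry argument). Suppose that $f : G \to [-1, 1]$ is a function, that
$M \in \mathfrak{M}_n(\mathbb{F}_5)$, and that $b \in \mathbb{F}_5^n$. Suppose that

\[ \mathbb{E}_h |\Delta(f;h)^{\wedge}(Mh + b)|^2 \gg_\delta 1. \]

Then $M$ is approximately symmetric in the sense that

\[ \operatorname{rk}(M - M^T) \ll_\delta 1. \]

Proof. Write $D = M - M^T$. Expanding the assumption gives

\[ \mathbb{E}_{x,y,h} f(x)f(x-h)f(y)f(y-h)\,\omega^{(x-y)^T M h + (x-y)^T b} \gg_\delta 1. \]

Making the substitution $z = x + y - h$, this becomes

\[ \mathbb{E}_{x,y,z} f(x)f(z-x)f(y)f(z-y)\,\omega^{(x-y)^T M (x+y-z) + (x-y)^T b} \gg_\delta 1, \]

which can be written

\[ \mathbb{E}_z \mathbb{E}_{x,y} \Delta'(f;z)(x)\,\Delta'(f;z)(y)\,\omega^{x^T M x - y^T M y + x^T D y - (x-y)^T M z + (x-y)^T b} \gg_\delta 1. \]

Here, we have written

\[ \Delta'(f;z)(t) := f(t)f(z-t). \]

Writing

\[ g_z(x) := \Delta'(f;z)(x)\,\omega^{x^T M x - x^T M z + b^T x}, \]

we have

\[ \mathbb{E}_z \mathbb{E}_{x,y}\, g_z(x)\overline{g_z(y)}\,\omega^{x^T D y} \gg_\delta 1. \]

Averaging over $z$, we see that there is some function $g : G \to \mathbb{C}$ with $\|g\|_\infty \leq 1$ such
that

\[ |\mathbb{E}_{x,y}\, g(x)\overline{g(y)}\,\omega^{x^T D y}| \gg_\delta 1, \]

that is to say (using $D^T = -D$)

\[ |\mathbb{E}_x\, g(x)\overline{\hat{g}(Dx)}| \gg_\delta 1. \]

This implies that

\[ \mathbb{E}_x |\hat{g}(Dx)| \gg_\delta 1, \]

and so in particular there are $\gg_\delta |G|$ values of $x$ such that $|\hat{g}(Dx)| \gg_\delta 1$. However we
know from Parseval's identity that the number of $r$ such that $|\hat{g}(r)| \gg_\delta 1$ is $\ll_\delta 1$. Thus
there is some set $S \subseteq \mathbb{F}_5^n$ with $|S| \gg_\delta |G|$ and $|D(S)| \ll_\delta 1$. This implies that

\[ |\ker(D)| \gg_\delta |G|, \]

which immediately implies the result. □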

We have shown that if $\|f\|_{U^3}$ is large then the derivative of $f$ correlates with a
symmetric linear form. To complete the proof of Proposition 2.2 we must "integrate" this
statement and show that $f$ correlates with a quadratic. We give this integration now.

Proof of Proposition 2.2. From Corollary 2.7 and Lemma 2.8 we know that

\[ \mathbb{E}_h |\Delta(f;h)^{\wedge}(Mh + b)|^2 \gg_\delta 1, \tag{2.7} \]

where

\[ \operatorname{rk}(M - M^T) \ll_\delta 1. \]

Write $M_{\mathrm{sym}} := \frac{1}{2}(M + M^T)$, and let $V := \ker(M - M^T)$. For each $t \in G$ there is some
$b_t$ such that we have

\[ Mh + b = M_{\mathrm{sym}}h + b_t \]

for all $h \in V + t$. By a trivial averaging argument and the fact that $\operatorname{codim}(V) \ll_\delta 1$,
we may find a $t$ such that

\[ \mathbb{E}_h 1_{h \in V + t} |\Delta(f;h)^{\wedge}(Mh + b)|^2 \gg_\delta 1. \]

This of course implies that

\[ \mathbb{E}_h 1_{h \in V + t} |\Delta(f;h)^{\wedge}(M_{\mathrm{sym}}h + b_t)|^2 \gg_\delta 1, \]

and hence by positivity that

\[ \mathbb{E}_h |\Delta(f;h)^{\wedge}(M_{\mathrm{sym}}h + b_t)|^2 \gg_\delta 1. \]

By redefining $M$ to be $M_{\mathrm{sym}}$ and $b$ to be $b_t$, it follows that we may assume in (2.7) that
$M$ is symmetric.

Expanding out (2.7) we obtain

\[ \mathbb{E}_{h,x,y} f(x)f(x-h)f(y)f(y-h)\,\omega^{h^T M (x-y) + b^T (x-y)} \gg_\delta 1. \]

Substituting $y := x - k$, we obtain

\[ \mathbb{E}_{h,x,k} f(x)f(x-h)f(x-k)f(x-h-k)\,\omega^{h^T M k + b^T k} \gg_\delta 1. \]

Using the identity

\[ x^T M x - (x-h)^T M (x-h) - (x-k)^T M (x-k) + (x-h-k)^T M (x-h-k) = 2h^T M k, \]

this may be written as

\[ \mathbb{E}_{h,x,k}\, g_1(x) g_2(x-h) g_3(x-k) g_4(x-h-k) \gg_\delta 1, \tag{2.8} \]

where $g_1(x) := f(x)\omega^{\frac{1}{2}x^T M x}$, $g_2(x) := f(x)\omega^{-\frac{1}{2}x^T M x + b^T x}$, $g_3(x) := f(x)\omega^{-\frac{1}{2}x^T M x}$ and
$g_4(x) := f(x)\omega^{\frac{1}{2}x^T M x - b^T x}$. Note that the functions $g_2, g_3, g_4$ are bounded by 1; this is,
in fact, the only property of them that we shall use.

Now the left-hand side of (2.8) may be rewritten using the Fourier transform as

\[ \sum_r \hat{g}_1(r)\hat{g}_2(-r)\hat{g}_3(-r)\hat{g}_4(r). \]

It follows immediately from Hölder's inequality that

\[ \|\hat{g}_1\|_4 \gg_\delta 1, \]

which, since $\|\hat{g}_1\|_2 \leq 1$, implies that

\[ \|\hat{g}_1\|_\infty \gg_\delta 1, \]

that is to say there is some $r \in \mathbb{F}_5^n$ such that

\[ |\mathbb{E}_x f(x)\omega^{\frac{1}{2}x^T M x + r^T x}| \gg_\delta 1. \]

This, at last, completes the proof of Proposition 2.2. □

Remark. In going from (2.8) to the end of the proof, what we have really done is apply
the Gowers-Cauchy-Schwarz inequality (cf. the exercises following Lecture 1) and the
inverse theorem for the $U^2$-norm.
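
The integration step in the proof of Proposition 2.2 rests on the second-difference identity for a symmetric quadratic form. As a sanity check, the following Python sketch verifies that identity by brute force over $\mathbb{F}_5^2$. The two test matrices and all helper names are illustrative choices, not part of the notes.

```python
import itertools

P = 5  # arithmetic is carried out in F_5


def quad(M, x):
    """The quadratic form x^T M x, reduced mod P."""
    n = len(x)
    return sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n)) % P


def bilinear(M, h, k):
    """The bilinear form h^T M k, reduced mod P."""
    n = len(h)
    return sum(h[i] * M[i][j] * k[j] for i in range(n) for j in range(n)) % P


def sub(u, v):
    """Coordinatewise difference u - v in F_5^n."""
    return tuple((a - b) % P for a, b in zip(u, v))


def second_difference(M, x, h, k):
    """q(x) - q(x-h) - q(x-k) + q(x-h-k) for q(x) = x^T M x."""
    return (quad(M, x) - quad(M, sub(x, h)) - quad(M, sub(x, k))
            + quad(M, sub(sub(x, h), k))) % P


def identity_holds(M):
    """Check that the second difference equals 2 h^T M k for all x, h, k."""
    vectors = list(itertools.product(range(P), repeat=len(M)))
    return all(
        second_difference(M, x, h, k) == (2 * bilinear(M, h, k)) % P
        for x in vectors for h in vectors for k in vectors
    )


M_SYMMETRIC = [[1, 3], [3, 2]]   # hypothetical symmetric example
M_LOPSIDED = [[0, 1], [0, 0]]    # hypothetical non-symmetric example

print(identity_holds(M_SYMMETRIC))  # True
print(identity_holds(M_LOPSIDED))   # False
```

The second call fails because, for a general matrix, the second difference equals $h^T(M + M^T)k$ rather than $2h^T M k$; this is one way to see why the symmetry argument of Lemma 2.8 has to precede the integration.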

Further reading. The original argument of Gowers is in [8]. This took place in the group $G = \mathbb{Z}/N\mathbb{Z}$, not in a
finite field model, and did not quite give a necessary and sufficient inverse theorem for the $U^3$-norm. It was
instead shown that if $f : \mathbb{Z}/N\mathbb{Z} \to [-1, 1]$ has large $U^3$-norm then $f$ correlates with a quadratic polynomial
on some subprogression of length a power of $N$. This is a "local" statement, and as such is much weaker than
having large $U^3$-norm, which is "global", i.e. involves averaging over the whole group $G$.

To get an inverse theorem, one extra ingredient must be added to Gowers' work. This is the symmetry
argument, Lemma 2.8. It was first given in [18]. That paper gives an inverse theorem for the $U^3$-norm in any
finite abelian group of odd order. To even state the result is somewhat complicated, and we defer a discussion
until we have thoroughly examined the finite field case. An inverse theorem for the $U^3$-norm in $\mathbb{F}_2^n$ was given
by Samorodnitsky [29], using the method we have described but with a slight twist to enable him to handle
characteristic 2. It is very likely that a combination of his methods and ours would allow one to prove an
inverse theorem in any finite abelian $G$, but to my knowledge no-one has yet undertaken this task.

As we remarked, one may replace our $\gg_\delta 1$ notation with more precise bounds, ending up with a version
of Proposition 2.2 with a function of the form $\exp(-C\delta^{-C})$ on the right-hand side. It would be of great
interest to know whether this could be improved, perhaps even to $c\delta^C$. This would follow from the so-called
Polynomial-Freiman-Ruzsa conjecture, the finite field version of which is discussed in [TJ.

The strongest known inverse result for the $U^3$ norm on $\mathbb{F}_5^n$ is the following, proved in [18].

Proposition 2.9 (Inverse theorem for the $U^3$-norm on $\mathbb{F}_5^n$, II). Suppose that $f : \mathbb{F}_5^n \to [-1, 1]$ is a function
for which $\|f\|_{U^3} \geq \delta$. Then there exists a subspace $H \leq \mathbb{F}_5^n$ with $\operatorname{codim}(H) \leq C\delta^{-C}$, together with a system of
quadratic forms $r_y^T x + x^T M_y x$ indexed by the cosets $y + H$ of $H$, such that

\[ \mathbb{E}_y |\mathbb{E}_{x \in y + H} f(x)\omega^{x^T M_y x + r_y^T x}| \geq c\delta^C. \]

Note that the amount of correlation is $c\delta^C$ rather than $\exp(-C\delta^{-C})$, but one must pass to a coset of a subspace
of somewhat large codimension.

The proof of this result is rather longer than that of Proposition 2.2 and involves a good deal more machinery
(Bogolyubov's method and Freiman homomorphisms). This stronger result is necessary for certain applications,
for example in our paper [T^ in which it is shown that $r_4(\mathbb{F}_5^n) \ll N(\log N)^{-c}$.



3. Lecture 3

Topics to be covered:

• Quadratic factors
• The energy increment lemma
• The idea of approximating a function by projecting onto a low-complexity factor
• The Koopman-von Neumann decomposition
• The arithmetic regularity decomposition

Our main effort so far has been devoted to proving a result of the form "if $\|f\|_{U^3}$ is
large then $f$ has a large quadratic Fourier coefficient".

In this section we turn to a discussion of how this kind of information can be useful to 
us. There are many instances in additive combinatorics where study of a single Fourier 
coefficient is fruitful. However there are many other occasions on which it is beneficial 
to consider several Fourier coefficients of $f$, say the set of large Fourier coefficients of
$f$. We must develop analogues of this theory in the quadratic setting.

From now on, matrices $M \in \mathfrak{M}_n(\mathbb{F}_5)$ will only appear in quadratic forms $x^T M x$. Thus
from this point onwards it is natural to adopt the convention that all matrices are
symmetric. We note that a (slightly) more high-brow approach to the whole theory,
avoiding the use of bases, appears in our paper [19].

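The following simple lemma will be used over and over again.

Lemma 3.1 (Gauss sums). Suppose that $M$ is symmetric and that $\operatorname{rk} M = d$. Then for
any $r \in G$ we have

\[ |\mathbb{E}_{x \in G}\, \omega^{x^T M x + r^T x}| \leq 5^{-d/2}. \]

If $r = 0$ then equality occurs.

Proof. Squaring, we obtain

\[ |\mathbb{E}_{x \in G}\, \omega^{x^T M x + r^T x}|^2 = \mathbb{E}_h\, \omega^{h^T M h + r^T h}\, \mathbb{E}_x\, \omega^{2h^T M x}. \]

The inner sum is zero unless $h \in \ker(M)$. This occurs with probability $5^{-d}$, and so we
do indeed get

\[ |\mathbb{E}_{x \in G}\, \omega^{x^T M x + r^T x}| \leq 5^{-d/2}. \]

If $r = 0$ then the phase $\omega^{h^T M h + r^T h}$ is actually equal to 1 when $h \in \ker(M)$, and so
equality occurs. □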
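
Lemma 3.1 is easy to test numerically in small dimension. The sketch below computes the normalised Gauss sum by brute force over $\mathbb{F}_5^3$ and compares it with $5^{-d/2}$. The matrix `M` is a hypothetical example of rank 2, and the helper names are ours.

```python
import cmath
import itertools

P = 5  # we work over the field F_5
OMEGA = cmath.exp(2j * cmath.pi / P)  # omega = e^{2 pi i / 5}


def rank_mod_p(matrix, p=P):
    """Rank of a matrix over F_p, computed by Gaussian elimination."""
    rows = [[entry % p for entry in row] for row in matrix]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue  # no pivot available in this column
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inverse = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(inverse * entry) % p for entry in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                factor = rows[i][col]
                rows[i] = [(a - factor * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def gauss_sum(matrix, r, p=P):
    """E_x omega^{x^T M x + r^T x}, averaged over all x in F_p^n."""
    n = len(matrix)
    total = 0
    for x in itertools.product(range(p), repeat=n):
        quad = sum(x[i] * matrix[i][j] * x[j] for i in range(n) for j in range(n))
        lin = sum(ri * xi for ri, xi in zip(r, x))
        total += OMEGA ** ((quad + lin) % p)
    return total / p ** n


M = [[1, 2, 0], [2, 4, 0], [0, 0, 3]]  # symmetric, rank 2 over F_5
d = rank_mod_p(M)
bound = P ** (-d / 2)
at_zero = abs(gauss_sum(M, (0, 0, 0)))
worst = max(abs(gauss_sum(M, r)) for r in itertools.product(range(P), repeat=3))

print(d, bound, at_zero, worst)
```

Up to floating-point error, `at_zero` should coincide with `bound` (the equality case $r = 0$) and `worst` should not exceed it.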

Using this lemma, we may highlight one of the immediate difficulties with formulating 
"quadratic Fourier analysis". 

Lemma 3.2 (Profusion of large QFCs). Let $f : \mathbb{F}_5^n \to [-1, 1]$ be a function. Then there
are at most $\delta^{-2}$ values of $r$ for which

\[ |\hat{f}(r)| = |\mathbb{E}_{x \in \mathbb{F}_5^n} f(x)\omega^{r^T x}| \geq \delta. \]

However, the number of pairs $(M, r)$ such that

\[ |\mathbb{E}_{x \in \mathbb{F}_5^n} f(x)\omega^{x^T M x + r^T x}| \geq \delta \]

need not be bounded in terms of $\delta$.

Proof. The first statement, which is included for comparison with the classical setting,
is immediate from Parseval's identity. To illustrate the second, one may consider a
function as simple as $f(x) = 1$. For any symmetric matrix $M$ with $\operatorname{rk}(M) \leq \log_5(1/\delta)$,
we have

\[ |\mathbb{E}_{x \in \mathbb{F}_5^n} f(x)\omega^{x^T M x}| \geq \delta \]

by Lemma 3.1. The number of such matrices is not bounded in terms of $\delta$. □

This lemma suggests that we should perhaps only consider QFCs as "essentially different"
if they are not too close in rank. This turns out to be a useful idea, and we will
return to it later when we are in a position to formulate it properly.

As we said there are many arguments (e.g. [HI [101 ESI [30]) where one considers the set 
of $\delta$-large Fourier coefficients

\[ \operatorname{Spec}_\delta(f) := \{r \in \mathbb{F}_5^n : |\hat{f}(r)| \geq \delta\}. \]

Without going into details of the applications, let us describe a useful way to think
about the way this construction is often used.

Definition 3.3 (Factors). Let $\phi_1, \dots, \phi_k : \mathbb{F}_5^n \to \mathbb{F}_5$ be any functions. These functions
describe a $\sigma$-algebra $\mathcal{B}$ on $\mathbb{F}_5^n$, the atoms of which are sets (of which there are at most
$5^k$) of the form $\{x : \phi_1(x) = c_1, \dots, \phi_k(x) = c_k\}$. If $f : \mathbb{F}_5^n \to \mathbb{C}$ is a function then
we often consider the conditional expectation $\mathbb{E}(f|\mathcal{B})$. Note that $\mathbb{E}(f|\mathcal{B})(x)$ is just the
average of $f$ over the atom $\mathcal{B}(x)$ which contains $x$. We will usually refer to $\sigma$-algebras
arising in this way as factors, by analogy with ergodic theory. We say that a factor $\mathcal{B}'$
refines $\mathcal{B}$ if every atom of $\mathcal{B}'$ is contained in an atom of $\mathcal{B}$. Thus $\mathcal{B}'$ is at least as fine a
partition of $\mathbb{F}_5^n$ as $\mathcal{B}$ is.

Definition 3.4 (Linear factors). Suppose that $r_1, \dots, r_k \in \mathbb{F}_5^n$. Then the $\sigma$-algebra
$\mathcal{B}$ whose atoms are the sets $\{x : r_i^T x = c_i, i = 1, \dots, k\}$ is called a linear factor of
complexity at most $k$.

Proposition 3.5 (Linear Koopman-von Neumann decomposition). Let $f : \mathbb{F}_5^n \to [-1, 1]$
be a function and let $\delta > 0$ be a parameter. Then there is a linear factor $\mathcal{B}$ of complexity
at most $4\delta^{-4}$ such that

\[ f = f_1 + f_2, \]

where

\[ f_1 := \mathbb{E}(f|\mathcal{B}) \]

and

\[ \|f_2\|_{U^2} \leq \delta. \]

Remark. The Koopman-von Neumann theorem may be described in words as "any
bounded function is the sum of a "low complexity" function formed by projecting onto
a linear factor, and a "uniform" function which is small in $U^2$".

Proof. The proof we give uses Fourier analysis, and does not generalise to give a result
for the $U^3$-norm. We include it to justify the fact that this is a proposition which
encodes the notion of "taking all the large Fourier coefficients of $f$".

Write $\eta := \delta^2/2$. Let $S := \operatorname{Spec}_\eta(f)$: note that by Parseval's identity we have $|S| \leq 4\delta^{-4}$.
Let $H = S^\perp$ be the annihilator of $S$ and write $\mu_H$ for the Haar measure on $H$, that is
to say $\mu_H := 1_H/\mathbb{E}1_H$. Define $f_1 := f * \mu_H$ and $f_2 := f - f * \mu_H$. It is not hard to see
that $f_1 = \mathbb{E}(f|\mathcal{B})$, where $\mathcal{B}$ is the factor defined by the linear functions $r^T x$, $r \in S$. To
conclude the proof, we only need check that $\|\hat{f}_2\|_\infty$ is small. To that end, we have

\[ |\hat{f}_2(r)| = |\hat{f}(r)||1 - \hat{\mu}_H(r)|. \]

If $r \in \operatorname{Spec}_\eta(f)$ then $\hat{\mu}_H(r) = 1$, and so $\hat{f}_2(r) = 0$. If $r \notin \operatorname{Spec}_\eta(f)$ then by definition we
have $|\hat{f}(r)| \leq \eta$, and so $|\hat{f}_2(r)| \leq 2\eta$ in this case. It follows that $\|\hat{f}_2\|_\infty \leq 2\eta$, and thus
by the inverse theorem for the $U^2$-norm we have $\|f_2\|_{U^2} \leq \sqrt{2\eta} = \delta$. The result follows. □
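
The Fourier-analytic proof above is constructive, and can be followed step by step on a small example. The Python sketch below does this in $\mathbb{F}_5^2$: it computes the spectrum $\operatorname{Spec}_\eta(f)$, projects onto the linear factor it generates, and measures the $U^2$-norm of the remainder through the identity $\|g\|_{U^2}^4 = \sum_r |\hat{g}(r)|^4$. The test function `f`, the value of `delta` and the helper names are assumptions made for illustration only.

```python
import cmath
import itertools

P, N_DIM = 5, 2
OMEGA = cmath.exp(2j * cmath.pi / P)
POINTS = list(itertools.product(range(P), repeat=N_DIM))


def dot(r, x):
    """The linear form r^T x over F_5."""
    return sum(a * b for a, b in zip(r, x)) % P


def fourier(f):
    """hat f(r) = E_x f(x) omega^{r^T x}, returned as a dict indexed by r."""
    return {r: sum(f[x] * OMEGA ** dot(r, x) for x in POINTS) / len(POINTS)
            for r in POINTS}


def project(f, spectrum):
    """E(f|B) for the linear factor B generated by x -> r^T x, r in spectrum."""
    atoms = {}
    for x in POINTS:
        atoms.setdefault(tuple(dot(r, x) for r in spectrum), []).append(x)
    out = {}
    for members in atoms.values():
        average = sum(f[x] for x in members) / len(members)
        out.update({x: average for x in members})
    return out


def koopman_von_neumann(f, delta):
    """Split f = f1 + f2 following the Fourier-analytic proof."""
    eta = delta ** 2 / 2
    spectrum = [r for r, c in fourier(f).items() if abs(c) >= eta]
    f1 = project(f, spectrum)
    f2 = {x: f[x] - f1[x] for x in POINTS}
    return spectrum, f1, f2


def u2_norm(g):
    """||g||_{U^2}, via the identity ||g||_{U^2}^4 = sum_r |hat g(r)|^4."""
    return sum(abs(c) ** 4 for c in fourier(g).values()) ** 0.25


# Hypothetical test function with values in {-1, 1}.
f = {x: 1.0 if (x[0] * x[0] + 3 * x[1] + x[0] * x[1]) % P < 2 else -1.0
     for x in POINTS}
delta = 0.8
spectrum, f1, f2 = koopman_von_neumann(f, delta)

print(len(spectrum), u2_norm(f2))
```

In this sketch the complexity `len(spectrum)` is at most $4\delta^{-4}$ by Parseval, and `u2_norm(f2)` is at most `delta`, in line with Proposition 3.5.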

Definition 3.6 (Quadratic factors). Let $r_1, \dots, r_{d_1} \in \mathbb{F}_5^n$ be vectors, and let $M_1, \dots, M_{d_2}$
$\in \mathfrak{M}_n(\mathbb{F}_5)$ be symmetric matrices. We write $\mathcal{B}_1$ for the linear factor generated by the
$r_i^T x$. Write $\mathcal{B}_2$ for the $\sigma$-algebra generated by the functions $r_i^T x$ and the pure quadratic
functions $x^T M_j x$. Clearly $\mathcal{B}_2$ refines $\mathcal{B}_1$. We call the pair $(\mathcal{B}_1, \mathcal{B}_2)$ a (homogeneous)
quadratic factor of complexity $(d_1, d_2)$.

Proposition 3.7 (Quadratic Koopman-von Neumann decomposition). Let $(\mathcal{B}_1^{(0)}, \mathcal{B}_2^{(0)})$
be a quadratic factor with complexity at most $(d_1^{(0)}, d_2^{(0)})$. Let $f : \mathbb{F}_5^n \to [-1, 1]$ be a
function and let $\delta > 0$ be a parameter. Then there is a quadratic factor $(\mathcal{B}_1, \mathcal{B}_2)$ of
complexity at most $(d_1^{(0)} + O_\delta(1), d_2^{(0)} + O_\delta(1))$ which refines $(\mathcal{B}_1^{(0)}, \mathcal{B}_2^{(0)})$, and such that

\[ f = f_1 + f_2, \]

where

\[ f_1 := \mathbb{E}(f|\mathcal{B}_2) \]

and

\[ \|f_2\|_{U^3} \leq \delta. \]

Remark. For applications in which bounds are unimportant, it is better to apply the
arithmetic regularity lemma which we will give later. A version of the Koopman-von
Neumann theorem with reasonable bounds is the key tool in [19]. In that application
we take $(\mathcal{B}_1^{(0)}, \mathcal{B}_2^{(0)})$ to be the trivial factor.

The key to proving the Koopman-von Neumann decomposition lies in the following
result.

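Lemma 3.8 (Energy increment). Let $(\mathcal{B}_1, \mathcal{B}_2)$ be a quadratic factor of complexity at
most $(d_1, d_2)$, and let $f : \mathbb{F}_5^n \to [-1, 1]$ be a function such that

\[ \|f - \mathbb{E}(f|\mathcal{B}_2)\|_{U^3} \geq \delta. \]

Then there exists a refinement $(\mathcal{B}_1', \mathcal{B}_2')$ of $(\mathcal{B}_1, \mathcal{B}_2)$ of complexity at most $(d_1 + 1, d_2 + 1)$ such
that we have the energy increment

\[ \|\mathbb{E}(f|\mathcal{B}_2')\|_2^2 \geq \|\mathbb{E}(f|\mathcal{B}_2)\|_2^2 + c(\delta), \tag{3.1} \]

where $c : (0, 1) \to \mathbb{R}_+$ is some non-decreasing function of $\delta$.

Proof. The function $g := f - \mathbb{E}(f|\mathcal{B}_2)$ is certainly bounded by 2, so we may apply the
inverse theorem for the $U^3$-norm (Proposition 2.2) to conclude that there is a quadratic
$x^T M x + r^T x$ so that

\[ |\mathbb{E}_x g(x)\omega^{x^T M x + r^T x}| \geq c(\delta). \tag{3.2} \]

We may clearly assume that $c : (0, 1) \to \mathbb{R}_+$ is a non-decreasing function. The linear part
$r^T x$ and the pure quadratic part $x^T M x$ of this quadratic together induce a quadratic
factor $(\widetilde{\mathcal{B}}_1, \widetilde{\mathcal{B}}_2)$ of complexity $(1, 1)$.

Now since $x^T M x + r^T x$ is $\widetilde{\mathcal{B}}_2$-measurable, it is clear that

\[ \mathbb{E}_x g(x)\omega^{x^T M x + r^T x} = \mathbb{E}_x \mathbb{E}(g|\widetilde{\mathcal{B}}_2)(x)\omega^{x^T M x + r^T x}. \]

In particular, (3.2) implies that

\[ \|\mathbb{E}(g|\widetilde{\mathcal{B}}_2)\|_1 \geq c(\delta). \tag{3.3} \]

Now define $\mathcal{B}_1' := \mathcal{B}_1 \vee \widetilde{\mathcal{B}}_1$ and $\mathcal{B}_2' := \mathcal{B}_2 \vee \widetilde{\mathcal{B}}_2$. The meaning of this is the obvious
one; simply intersect all the atoms of $\mathcal{B}_i$ with those of $\widetilde{\mathcal{B}}_i$. It is clear that $(\mathcal{B}_1', \mathcal{B}_2')$ is a
quadratic factor of complexity at most $(d_1 + 1, d_2 + 1)$.

It remains to establish the energy increment (3.1). A key tool is

Pythagoras' theorem. Suppose that $\mathcal{B}, \mathcal{B}'$ are two $\sigma$-algebras on $\mathbb{F}_5^n$ such that $\mathcal{B}'$
refines $\mathcal{B}$. Let $f : \mathbb{F}_5^n \to [-1, 1]$ be any function. Then

\[ \|\mathbb{E}(f|\mathcal{B}')\|_2^2 = \|\mathbb{E}(f|\mathcal{B})\|_2^2 + \|\mathbb{E}(f|\mathcal{B}') - \mathbb{E}(f|\mathcal{B})\|_2^2. \]

Now we have the chain of inequalities

\[ \|\mathbb{E}(f|\mathcal{B}_2')\|_2^2 - \|\mathbb{E}(f|\mathcal{B}_2)\|_2^2 = \|\mathbb{E}(f|\mathcal{B}_2') - \mathbb{E}(f|\mathcal{B}_2)\|_2^2 \]
\[ = \|\mathbb{E}(g|\mathcal{B}_2')\|_2^2 \]
\[ \geq \|\mathbb{E}(g|\widetilde{\mathcal{B}}_2)\|_2^2 \]
\[ \geq \|\mathbb{E}(g|\widetilde{\mathcal{B}}_2)\|_1^2 \]
\[ \geq c(\delta)^2. \]

The justification of these five lines uses respectively Pythagoras' theorem, the fact that
$\mathcal{B}_2'$ refines $\mathcal{B}_2$, Pythagoras' theorem together with the fact that $\mathcal{B}_2'$ refines $\widetilde{\mathcal{B}}_2$, the
Cauchy-Schwarz inequality, and (3.3). Replacing $c(\delta)$ by $c(\delta)^2$ gives (3.1). □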
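
Pythagoras' theorem for nested factors, which drives the whole energy increment strategy, can be checked directly on a small example. The sketch below works in $\mathbb{F}_5^2$ with one linear and one quadratic function generating the factors. The function `f` and the two generating functions are hypothetical choices made for this illustration.

```python
import itertools

P, N_DIM = 5, 2
POINTS = list(itertools.product(range(P), repeat=N_DIM))


def conditional_expectation(f, phis):
    """E(f|B), where B is the factor generated by the functions in phis."""
    atoms = {}
    for x in POINTS:
        atoms.setdefault(tuple(phi(x) for phi in phis), []).append(x)
    out = {}
    for members in atoms.values():
        average = sum(f[x] for x in members) / len(members)
        out.update({x: average for x in members})
    return out


def energy(g):
    """The squared L^2 norm E_x |g(x)|^2."""
    return sum(v * v for v in g.values()) / len(g)


def linear(x):
    """A linear function r^T x with r = (1, 2)."""
    return (x[0] + 2 * x[1]) % P


def quadratic(x):
    """A pure quadratic x^T M x with M = [[1, 3], [3, 0]]."""
    return (x[0] * x[0] + x[0] * x[1]) % P


f = {x: 1.0 if (x[0] * x[1] + x[1]) % P in (0, 1) else -1.0 for x in POINTS}
coarse = conditional_expectation(f, [linear])            # E(f|B)
fine = conditional_expectation(f, [linear, quadratic])   # E(f|B'), B' refines B
gap = {x: fine[x] - coarse[x] for x in POINTS}

lhs = energy(fine)
rhs = energy(coarse) + energy(gap)
print(lhs, rhs)
```

The two printed numbers agree up to rounding, and in particular the energy never decreases on passing to a refinement, which is what bounds the number of iterations in the proofs that follow.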



Proof of Proposition 3.7. Start with $(\mathcal{B}_1, \mathcal{B}_2) = (\mathcal{B}_1^{(0)}, \mathcal{B}_2^{(0)})$. If

\[ \|f - \mathbb{E}(f|\mathcal{B}_2)\|_{U^3} \leq \delta \tag{3.4} \]

then STOP. Otherwise, we may apply Lemma 3.8 to extend $(\mathcal{B}_1, \mathcal{B}_2)$ to a quadratic factor
with complexity incremented by at most $(1, 1)$ and the energy $\|\mathbb{E}(f|\mathcal{B}_2)\|_2^2$ incremented
by at least $c(\delta)$. If (3.4) holds then STOP, otherwise repeat the process. Since $f$ is
bounded, the energy $\|\mathbb{E}(f|\mathcal{B}_2)\|_2^2$ lies in the interval $[0, 1]$. Since $c : (0, 1) \to \mathbb{R}_+$ is
non-decreasing, we cannot iterate the above procedure more than $1/c(\delta)$ times before
we STOP. The claim follows. □

We will not give an application of the Koopman-von Neumann decomposition, since the
interesting applications require quantitative versions of the result (cf. [19]). The result
has a significant shortcoming, which is that the uniformity parameter $\delta$ need not be
small in terms of the complexity of $(\mathcal{B}_1, \mathcal{B}_2)$. For such situations there is another type
of decomposition, which we call the arithmetic regularity lemma because of an analogy
with Szemerédi's regularity lemma in graph theory. We note that any use of this type
of decomposition necessarily results in terrible "tower-type" bounds: see for example
[TJIll]. As we have stated, however, bounds are not our concern in these lectures.

Proposition 3.9 (Arithmetic regularity lemma for $U^3$). Let $\delta > 0$ be a parameter, and
let $\omega : \mathbb{R}_+ \to \mathbb{R}_+$ be an arbitrary growth function¹ (which may depend on $\delta$). Suppose that
$n > n_0(\omega, \delta)$ is sufficiently large, and let $f : \mathbb{F}_5^n \to [-1, 1]$ be a function. Let $(\mathcal{B}_1^{(0)}, \mathcal{B}_2^{(0)})$
be a quadratic factor of complexity $(d_1^{(0)}, d_2^{(0)})$. Then there is $C = C(\delta, \omega, d_1^{(0)}, d_2^{(0)})$ and
a quadratic factor $(\mathcal{B}_1, \mathcal{B}_2)$ which refines $(\mathcal{B}_1^{(0)}, \mathcal{B}_2^{(0)})$ and has complexity at most $(d, d)$,
$d \leq C$, together with a decomposition

\[ f = f_1 + f_2 + f_3, \]

where

\[ f_1 := \mathbb{E}(f|\mathcal{B}_2), \]

\[ \|f_2\|_2 \leq \delta \]

and

\[ \|f_3\|_{U^3} \leq 1/\omega(d). \]

¹ The use of arbitrary growth functions really does put us in the domain of "discrete analogues of
infinitary mathematics". The arithmetic regularity lemma is indeed very close in spirit to the main
result of the ergodic-theoretic paper |2 ■.

Proof. Apply the Koopman-von Neumann theorem iteratively, with parameters $\delta_i$,
$i = 1, 2, \dots$ to obtain quadratic factors $(\mathcal{B}_1^{(i)}, \mathcal{B}_2^{(i)})$ with complexities at most $(C_i, C_i)$
such that

• $(\mathcal{B}_1^{(i)}, \mathcal{B}_2^{(i)})$ is a refinement of $(\mathcal{B}_1^{(i-1)}, \mathcal{B}_2^{(i-1)})$;
• $\|f - \mathbb{E}(f|\mathcal{B}_2^{(i)})\|_{U^3} \leq \delta_i$;
• $C_i$ is bounded above in terms of $C_{i-1}$ and $\delta_i$.

Choose the sequence of $\delta_i$'s such that $\delta_{i+1} \leq 1/\omega(C_i)$ for all $i$. Since $C_i$ is bounded above
by a quantity depending only on $\delta_1, \dots, \delta_i$, this is certainly possible.

Now the energies $\|\mathbb{E}(f|\mathcal{B}_2^{(i)})\|_2^2$ are non-decreasing, and are all bounded by 1. By the
pigeonhole principle there is therefore some $i \leq \lceil \delta^{-2} \rceil$ such that

\[ \|\mathbb{E}(f|\mathcal{B}_2^{(i+1)})\|_2^2 - \|\mathbb{E}(f|\mathcal{B}_2^{(i)})\|_2^2 \leq \delta^2. \]

For such an $i$, we may take for our decomposition

\[ f_1 := \mathbb{E}(f|\mathcal{B}_2^{(i)}), \]

\[ f_2 := \mathbb{E}(f|\mathcal{B}_2^{(i+1)}) - \mathbb{E}(f|\mathcal{B}_2^{(i)}) \]

and

\[ f_3 := f - \mathbb{E}(f|\mathcal{B}_2^{(i+1)}). \]

It follows from Pythagoras' theorem that $\|f_2\|_2 \leq \delta$, as required. □

What is the point of the Koopman-von Neumann and arithmetic regularity results, say
for the $U^3$-norm? The answer is that they often reduce the study of general functions
(say from the point of view of counting 4-term arithmetic progressions) to the study of
projections $\mathbb{E}(f|\mathcal{B}_2)$ onto "low-complexity" quadratic factors. This, however, is of little
consequence unless we can study those supposedly simple objects.

Definition 3.10 (Rank of quadratic factors). Suppose that $(\mathcal{B}_1, \mathcal{B}_2)$ is a quadratic
factor of complexity $(d_1, d_2)$, being defined by $d_1$ linear forms $r_1^T x, \dots, r_{d_1}^T x$ and $d_2$ pure
quadratics $x^T M_1 x, \dots, x^T M_{d_2} x$. We say that $(\mathcal{B}_1, \mathcal{B}_2)$ has rank at least $r$ if

\[ \operatorname{rk}(\lambda_1 M_1 + \dots + \lambda_{d_2} M_{d_2}) \geq r \]

whenever $\lambda_1, \dots, \lambda_{d_2}$ are elements of $\mathbb{F}_5$, not all zero.

When we are not concerned with bounds, it turns out that we may assume our quadratic
factors have exceedingly large rank. We will see in the next lecture that factors with
high rank are much easier to handle than factors with small rank.

Lemma 3.11 (Making factors high-rank). Let $\omega : \mathbb{R}_+ \to \mathbb{R}_+$ be an arbitrary growth
function. Then there is another function $\tau = \tau_\omega$ with the following property. Let $(\mathcal{B}_1, \mathcal{B}_2)$
be a quadratic factor with complexity at most $(d_1, d_2)$. Then there is a refinement $(\mathcal{B}_1', \mathcal{B}_2')$
of $(\mathcal{B}_1, \mathcal{B}_2)$ with complexity at most $(d_1', d_2')$, where $d_1' \leq \tau(d_1, d_2)$ and $d_2' \leq d_2$, which has rank at least
$\omega(d_1' + d_2')$.

Proof. Suppose as usual that $(\mathcal{B}_1, \mathcal{B}_2)$ is described by $d_1$ linear functions $r_1^T x, \dots, r_{d_1}^T x$
and $d_2$ "pure quadratics" $x^T M_1 x, \dots, x^T M_{d_2} x$. Suppose that $(\mathcal{B}_1, \mathcal{B}_2)$ does not have
rank at least $\omega(d_1 + d_2)$. Then there is some relation

\[ \operatorname{rk}(\lambda_1 M_1 + \dots + \lambda_{d_2} M_{d_2}) \leq \omega(d_1 + d_2), \]

where we may assume without loss of generality that $\lambda_{d_2} = 1$. Let $s_1, \dots, s_k$, $k \leq \omega(d_1 + d_2)$,
be a basis for $\ker(U)^\perp$, where $U := \lambda_1 M_1 + \dots + \lambda_{d_2} M_{d_2}$, and let $(\mathcal{B}_1^{(1)}, \mathcal{B}_2^{(1)})$ be the
homogeneous quadratic factor defined by the linear forms $r_1^T x, \dots, r_{d_1}^T x, s_1^T x, \dots, s_k^T x$
and the quadratic forms $x^T M_1 x, \dots, x^T M_{d_2 - 1} x$. It has complexity bounded by $(d_1^{(1)}, d_2 - 1)$,
where $d_1^{(1)} \leq d_1 + \omega(d_1 + d_2)$. The value of $x^T M_{d_2} x$ is determined by the values of the
$x^T M_i x$, $i = 1, \dots, d_2 - 1$, together with the value of $x^T U x$. This in turn is determined
by the coset of $\ker(U)$ that $x$ lies in, and hence by $s_1^T x, \dots, s_k^T x$. It follows that $(\mathcal{B}_1^{(1)}, \mathcal{B}_2^{(1)})$
refines $(\mathcal{B}_1, \mathcal{B}_2)$.

Now we ask whether $(\mathcal{B}_1^{(1)}, \mathcal{B}_2^{(1)})$ has rank at most $\omega(d_1^{(1)} + d_2)$. If so, we refine again,
obtaining a new factor $(\mathcal{B}_1^{(2)}, \mathcal{B}_2^{(2)})$ with complexity bounded by $(d_1^{(1)} + \omega(d_1^{(1)} + d_2), d_2 - 2)$.
This procedure can last no more than $d_2$ steps, however, since at each stage the number
of pure quadratic phases is reduced by one. We may take $(\mathcal{B}_1', \mathcal{B}_2')$ to be the factor that
we have when the procedure terminates. □

Proposition 3.12 (Arithmetic regularity lemma for $U^3$, II). Let $\delta > 0$ be a parameter,
and let $\omega_1, \omega_2 : \mathbb{R}_+ \to \mathbb{R}_+$ be arbitrary growth functions (which may depend on $\delta$).
Let $n > n_0(\delta, \omega_1, \omega_2)$ be sufficiently large, and let $f : \mathbb{F}_5^n \to [-1, 1]$ be a function. Let
$(\mathcal{B}_1^{(0)}, \mathcal{B}_2^{(0)})$ be a quadratic factor of complexity $(d_1^{(0)}, d_2^{(0)})$. Then there is a quadratic
factor $(\mathcal{B}_1, \mathcal{B}_2)$ with the following properties:

(1) $(\mathcal{B}_1, \mathcal{B}_2)$ refines $(\mathcal{B}_1^{(0)}, \mathcal{B}_2^{(0)})$;

(2) The complexity of $(\mathcal{B}_1, \mathcal{B}_2)$ is at most $(d_1, d_2)$, where

\[ d_1, d_2 \leq C(\delta, \omega_1, \omega_2, d_1^{(0)}, d_2^{(0)}), \]

for some fixed function $C$;

(3) The rank of $(\mathcal{B}_1, \mathcal{B}_2)$ is at least $\omega_1(d_1 + d_2)$;

(4) There is a decomposition $f = f_1 + f_2 + f_3$, where

\[ f_1 := \mathbb{E}(f|\mathcal{B}_2), \]

\[ \|f_2\|_2 \leq \delta \]

and

\[ \|f_3\|_{U^3} \leq 1/\omega_2(d_1 + d_2). \]

Remark. The formulation is very similar to that in Proposition 3.9 but we now insist
that the factor $(\mathcal{B}_1, \mathcal{B}_2)$ be homogeneous, and also include a condition on its rank. The
statement of Proposition 3.12 will look complicated at first sight, but there is nothing
much to be scared of. As always with complicated propositions, it is as well to attempt
to formulate what has been proved in a somewhat looser, wordier way. Here is an
attempt:

Let $f$ be any function on $\mathbb{F}_5^n$. Then, up to an error which is small in $L^2$, we may
write $f$ as a sum of a function which is measurable with respect to a bounded complexity
quadratic factor, plus an error which is minuscule in $\|\cdot\|_{U^3}$. Furthermore we may insist
that the rank of the quadratic factor is huge in comparison to its complexity.

Proof. Apply Proposition 3.9 to get a factor $(\mathcal{B}_1, \mathcal{B}_2)$ refining $(\mathcal{B}_1^{(0)}, \mathcal{B}_2^{(0)})$, and a
decomposition $f = f_1 + f_2 + f_3$ such that $f_1 = \mathbb{E}(f|\mathcal{B}_2)$, $\|f_2\|_2 \leq \delta/2$ and $\|f_3\|_{U^3} \leq
1/\omega_2(\tau(d_1, d_2) + d_2)$, where $(d_1, d_2)$ is an upper bound for the complexity of $(\mathcal{B}_1, \mathcal{B}_2)$
and $\tau = \tau_{\omega_1}$ is the function appearing in Lemma 3.11. Using that lemma, we may
refine $(\mathcal{B}_1, \mathcal{B}_2)$ to a quadratic factor $(\mathcal{B}_1', \mathcal{B}_2')$ with complexity at most $(d_1', d_2')$, where
$d_1' \leq \tau(d_1, d_2)$ and $d_2' \leq d_2$, and with rank at least $\omega_1(d_1' + d_2')$. Define a new
decomposition $f = f_1' + f_2' + f_3'$, where

\[ f_1' := \mathbb{E}(f|\mathcal{B}_2'), \]

\[ f_2' := f_2 + \mathbb{E}(f|\mathcal{B}_2) - \mathbb{E}(f|\mathcal{B}_2') \]

and $f_3' = f_3$. Either this has the desired properties, or else we have

\[ \|\mathbb{E}(f|\mathcal{B}_2) - \mathbb{E}(f|\mathcal{B}_2')\|_2 > \delta/2. \]

By Pythagoras' theorem this leads to the energy increment

\[ \|\mathbb{E}(f|\mathcal{B}_2')\|_2^2 > \|\mathbb{E}(f|\mathcal{B}_2)\|_2^2 + \delta^2/4. \tag{3.5} \]

In this eventuality we apply Proposition 3.9 again, initialising with $(\mathcal{B}_1^{(0)}, \mathcal{B}_2^{(0)}) :=
(\mathcal{B}_1', \mathcal{B}_2')$. In view of the energy increment (3.5), we can only repeat this $\lceil 4/\delta^2 \rceil$ times
before we reach a decomposition with the properties we desire. □

Further reading. There is a wealth of directions to go in. Results of Koopman-von Neumann type go back,
implicitly, a long way. The name was first given, by Tao and me, to a result in our paper [17] on primes in AP.
That result was somewhat different to the results here, but the method of proof (the energy increment strategy)
is the same.

The arithmetic regularity lemma for the $U^3$-norm will be the subject of a forthcoming paper by Tao and me
[22]. There is, of course, an analogous result for the $U^2$-norm, and this was implicit in Bourgain !^'. The proof
there used the Fourier transform rather than the energy-increment strategy. A substantially more difficult (!)
proof of the same result was given 15 years later by me [TT]; a number of applications were given there. The
energy-increment proof of Proposition 3.9 seems at the moment to be the "right" way to think about these
issues, and is essentially the approach taken in [32].

There are connections with regularity results for graphs and hypergraphs, the first result of this type being
Szemerédi's regularity lemma [31]. There are also parallels with results in ergodic theory such as [5]. Perhaps
it is best to refer the reader to the lectures by Kra and Tao at this school. The ICM article by Tao [33] has
many references and would represent a fine place to begin further investigations.



4. Lecture 4

Topics to be covered:

• Working on a quadratic factor; the configuration space.
• A theorem on progressions of length 4: an example of how to put all the ingredients
together.

Our aim in this lecture is to prove the following theorem by using the machinery we
have developed. Recall that we are writing $N := 5^n$.

Theorem 4.1 (G.-Tao). Let $\alpha, \epsilon > 0$ be real numbers. Then there is an $n_0 = n_0(\alpha, \epsilon)$
with the following property. Suppose that $n > n_0(\alpha, \epsilon)$, and that $A \subseteq \mathbb{F}_5^n$ is a set with
density $\alpha$. Then there is some $d \neq 0$ such that $A$ contains at least $(\alpha^4 - \epsilon)N$ four-term
arithmetic progressions with common difference $d$.

Remarks. It is easy to see that one cannot replace $\alpha^4$ by anything larger, by considering
a random set of density $\alpha$. This theorem has, as a consequence, a version of Szemerédi's
theorem for progressions of length four in finite fields, namely $r_4(\mathbb{F}_5^n) = o(N)$. The
theorem is a finite field version of a conjecture of Bergelson, Host and Kra. Rather
bizarrely at first sight, this result does not generalise to progressions longer than four.

Now in the last lecture we worked rather hard in order to show that, in various senses,
the study of an arbitrary function $f : \mathbb{F}_5^n \to [-1, 1]$ can be reduced to the study of a
$\mathcal{B}_2$-measurable function $\mathbb{E}(f|\mathcal{B}_2)$, where $(\mathcal{B}_1, \mathcal{B}_2)$ is a quadratic factor with "bounded
complexity" and high rank. To make use of this, we need to be able to understand
$\mathcal{B}_2$-measurable functions. At the very least, we are going to want to know about the
size of the atoms in $\mathcal{B}_2$ and, for any four atoms, the number of four-term progressions
spanned by those atoms. It turns out that the "high-rank" assumption allows us to
simply compute these quantities using Fourier analysis.

Suppose, throughout this lecture, that $(\mathcal{B}_1, \mathcal{B}_2)$ is a quadratic factor defined by $d_1$ linear
forms $r_i^T x$ and $d_2$ pure quadratics $x^T M_j x$. (Recall that $\mathcal{B}_1$ is the $\sigma$-algebra generated
by the linear functions, and $\mathcal{B}_2$ is the $\sigma$-algebra generated by the linear and quadratic
functions.) We will always suppose (as we clearly may) that the vectors $r_i$ are linearly
independent.

To understand $\mathcal{B}_2$-measurable functions, that is to say functions which are constant on
atoms of $\mathcal{B}_2$ (or alternatively functions which have the form $\mathbb{E}(f|\mathcal{B}_2)$), it is helpful to
work in configuration space $\mathbb{F}_5^{d_1} \times \mathbb{F}_5^{d_2}$. We write $\Gamma : \mathbb{F}_5^n \to \mathbb{F}_5^{d_1}$ and $\Phi : \mathbb{F}_5^n \to \mathbb{F}_5^{d_2}$ for the
maps $\Gamma(x) := (r_1^T x, \dots, r_{d_1}^T x)$ and $\Phi(x) := (x^T M_1 x, \dots, x^T M_{d_2} x)$.


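Lemma 4.2 (Size of atoms). Suppose that $(\mathcal{B}_1, \mathcal{B}_2)$ has rank at least $r$. Let $(a, b) \in
\mathbb{F}_5^{d_1} \times \mathbb{F}_5^{d_2}$. Then the probability that a randomly chosen $x \in \mathbb{F}_5^n$ has $\Gamma(x) = a$ and
$\Phi(x) = b$ is $5^{-d_1 - d_2} + O(5^{-r/2})$.

Remark. In this lemma and the next, the probabilistic language is present only to avoid
normalising factors of $N = 5^n$. This is really a statement about the number of $x$ with
$\Gamma(x) = a$, $\Phi(x) = b$.

Proof. The quantity in question is given by

\[ \mathbb{E}_x \prod_{i=1}^{d_1} \Big( \frac{1}{5} \sum_{\mu_i \in \mathbb{F}_5} \omega^{\mu_i (r_i^T x - a_i)} \Big) \prod_{j=1}^{d_2} \Big( \frac{1}{5} \sum_{\lambda_j \in \mathbb{F}_5} \omega^{\lambda_j (x^T M_j x - b_j)} \Big), \]

which rearranges as

\[ 5^{-d_1 - d_2} \sum_{\mu \in \mathbb{F}_5^{d_1},\, \lambda \in \mathbb{F}_5^{d_2}} \omega^{-\mu^T a - \lambda^T b}\, \mathbb{E}_x\, \omega^{x^T (\lambda_1 M_1 + \dots + \lambda_{d_2} M_{d_2}) x + (\mu_1 r_1 + \dots + \mu_{d_1} r_{d_1})^T x}. \tag{4.1} \]

Now the rank of $(\mathcal{B}_1, \mathcal{B}_2)$ is at least $r$, which means that

\[ \operatorname{rk}(\lambda_1 M_1 + \dots + \lambda_{d_2} M_{d_2}) \geq r \]

whenever the $\lambda_j$ are not all zero. In view of the Gauss sum estimate, Lemma 3.1, this means that every term in (4.1) in
which the $\lambda_j$ are not all zero is bounded by $5^{-d_1 - d_2 - r/2}$. Of the terms with $\lambda_1 = \dots =
\lambda_{d_2} = 0$, the linear independence of the $r_i$ guarantees that the only term which does not
vanish is that with $\mu_1 = \dots = \mu_{d_1} = 0$. The result follows immediately. □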
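
To see Lemma 4.2 in action, one can simply count. The sketch below takes a hypothetical quadratic factor on $\mathbb{F}_5^6$ with one linear form and one pure quadratic (so $d_1 = d_2 = 1$), computes its rank in the sense of Definition 3.10, and compares every atom's density with the main term $5^{-d_1 - d_2}$. The proof above shows that the error is in fact at most $5^{-r/2}$, with implied constant 1, and that is the inequality checked here. All names and parameters are illustrative.

```python
import itertools
from collections import Counter

P = 5


def rank_mod_p(matrix, p=P):
    """Rank of a matrix over F_p, computed by Gaussian elimination."""
    rows = [[entry % p for entry in row] for row in matrix]
    rank = 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inverse = pow(rows[rank][col], -1, p)  # modular inverse (Python 3.8+)
        rows[rank] = [(inverse * entry) % p for entry in rows[rank]]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                factor = rows[i][col]
                rows[i] = [(a - factor * b) % p for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank


def factor_rank(quadratics, p=P):
    """Smallest rank of a nontrivial combination lambda_1 M_1 + ... over F_p."""
    n = len(quadratics[0])
    best = n
    for lam in itertools.product(range(p), repeat=len(quadratics)):
        if any(lam):
            combo = [[sum(l * M[i][j] for l, M in zip(lam, quadratics)) % p
                      for j in range(n)] for i in range(n)]
            best = min(best, rank_mod_p(combo, p))
    return best


def atom_probabilities(linears, quadratics, n, p=P):
    """Map each (a, b) in configuration space to P(Gamma(x) = a, Phi(x) = b)."""
    counts = Counter()
    for x in itertools.product(range(p), repeat=n):
        a = tuple(sum(ri * xi for ri, xi in zip(r, x)) % p for r in linears)
        b = tuple(sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n)) % p
                  for M in quadratics)
        counts[(a, b)] += 1
    return {key: c / p ** n for key, c in counts.items()}


n = 6
linears = [(1, 2, 0, 0, 0, 1)]                                     # d_1 = 1
quadratics = [[[int(i == j) for j in range(n)] for i in range(n)]]  # identity, d_2 = 1
r = factor_rank(quadratics)
probs = atom_probabilities(linears, quadratics, n)
main_term = P ** (-(len(linears) + len(quadratics)))
worst_error = max(
    abs(probs.get((a, b), 0.0) - main_term)
    for a in itertools.product(range(P), repeat=len(linears))
    for b in itertools.product(range(P), repeat=len(quadratics))
)

print(r, main_term, worst_error, P ** (-r / 2))
```

Because the rank is as large as it can be in this example, every atom has density close to $1/25$; lowering the rank (for instance by zeroing diagonal entries of the matrix) makes the atoms visibly less uniform.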

Lemma 4.3 (4-term progressions). Suppose that $(\mathcal{B}_1, \mathcal{B}_2)$ has rank at least $r$. Suppose
that $(a^{(1)}, b^{(1)}), \dots, (a^{(4)}, b^{(4)}) \in \mathbb{F}_5^{d_1} \times \mathbb{F}_5^{d_2}$. Suppose that a 4-term progression $(x, x +
d, x + 2d, x + 3d) \in (\mathbb{F}_5^n)^4$ is chosen at random. If

\[ a^{(1)}, a^{(2)}, a^{(3)}, a^{(4)} \text{ are in arithmetic progression} \tag{4.2} \]

and

\[ b^{(1)} - 3b^{(2)} + 3b^{(3)} - b^{(4)} = 0 \tag{4.3} \]

then the probability that $\Gamma(x + id) = a^{(i)}$, $\Phi(x + id) = b^{(i)}$ for $i = 1, 2, 3, 4$ is $5^{-2d_1 - 3d_2} +
O(5^{-r/2})$. Otherwise, it is zero.

Proof. The important thing to appreciate here is that four elements in different atoms
of $\mathcal{B}_2$ can only lie in arithmetic progression if the two constraints (4.2) and (4.3) are
satisfied. Furthermore these are the only relevant constraints, in that if they are satisfied
(and if the factor $(\mathcal{B}_1, \mathcal{B}_2)$ has large rank) then we can accurately count the number of
four-term progressions involving those atoms.

The necessity of the constraints (4.2) and (4.3) is easy. If $(x, x + d, x + 2d, x + 3d)$ is an
arithmetic progression, we need only observe that $\Gamma(x), \Gamma(x + d), \Gamma(x + 2d), \Gamma(x + 3d)$ are
also in arithmetic progression, and that $\Phi(x) - 3\Phi(x + d) + 3\Phi(x + 2d) - \Phi(x + 3d) = 0$.
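
The necessity of (4.2) and (4.3) amounts to the fact that second differences of linear functions and third differences of quadratics vanish. A short brute-force check in Python, over all progressions in $\mathbb{F}_5^3$ and for a hypothetical choice of linear forms and matrices, might look as follows.

```python
import itertools

P = 5


def gamma(linears, x):
    """Gamma(x) = (r_1^T x, ..., r_{d_1}^T x) over F_5."""
    return tuple(sum(a * b for a, b in zip(r, x)) % P for r in linears)


def phi(quadratics, x):
    """Phi(x) = (x^T M_1 x, ..., x^T M_{d_2} x) over F_5."""
    n = len(x)
    return tuple(
        sum(x[i] * M[i][j] * x[j] for i in range(n) for j in range(n)) % P
        for M in quadratics
    )


def constraints_hold(linears, quadratics, n):
    """Check (4.2) and (4.3) along every progression x, x+d, x+2d, x+3d."""
    vectors = list(itertools.product(range(P), repeat=n))
    for x in vectors:
        for d in vectors:
            pts = [tuple((xi + t * di) % P for xi, di in zip(x, d)) for t in range(4)]
            a = [gamma(linears, p) for p in pts]
            b = [phi(quadratics, p) for p in pts]
            # (4.2): consecutive differences of the a^{(i)} all agree.
            steps = [tuple((u - v) % P for u, v in zip(a[t + 1], a[t]))
                     for t in range(3)]
            if not (steps[0] == steps[1] == steps[2]):
                return False
            # (4.3): the alternating cubic combination of the b^{(i)} vanishes.
            cubic = [(b0 - 3 * b1 + 3 * b2 - b3) % P for b0, b1, b2, b3 in zip(*b)]
            if any(cubic):
                return False
    return True


linears = [(1, 0, 2)]
quadratics = [
    [[1, 2, 0], [2, 0, 1], [0, 1, 3]],
    [[0, 0, 1], [0, 2, 0], [1, 0, 0]],
]
print(constraints_hold(linears, quadratics, 3))  # True
```

No rank assumption enters here; the rank is only needed for the converse, quantitative half of the lemma.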

To obtain the statement about probability, we proceed in the same manner as in Lemma 
14.21 The notation here is, however, somewhat fearsome. We start with the observation 
that the probability in question is 

and then swap the order of summation to rearrange as 

r-4di-4d2 V^ Tg x^Px+2x^Qd+d'^Rd+u'^x+v'^d-w /a a\ 

where 

P = P(A) = f^iXf + Af ) + Af ) + Af )M„ 
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Q = g(A) = J2i^f^ + 2Af + 3Af + 4Af )M„ 
i=i 

R = R(X) = ^(A« + 4Xf + 9Xf + 16Xf)M^, 
u = «(/i) = f;(/i« + /if) + /if + /if V. 

i=l 

(1) , 0,,(2) , g,,{3) , . (4)^ 



(/i) = $:(/if) + 2/if + 3/.f + 4/if))r. 



V = V ^^ 

i=l 



and 



w=wi,,x)=tf:^f'-f'+tt>^?f- 

1=1 i=l 1=1 j=l 

We use Lemma 3.1 repeatedly. By fixing either $x$ or $d$, we see that the inner sum in
(4.4) (that is, the expectation over $x,d$) is $O(5^{-r/2})$ unless

\[ \lambda_j^{(1)} + \lambda_j^{(2)} + \lambda_j^{(3)} + \lambda_j^{(4)} = \lambda_j^{(1)} + 4\lambda_j^{(2)} + 9\lambda_j^{(3)} + 16\lambda_j^{(4)} = 0 \tag{4.5} \]

for each $j$, in which case certainly $P = R = 0$. In this case, the inner sum is a rather purer-looking

\[ \mathbb{E}_{x,d}\, \omega^{2x^T Q d + u^T x + v^T d - w}. \tag{4.6} \]

For fixed $d$, this is zero unless $2Qd + u = 0$. If $\lambda_j^{(1)} + 2\lambda_j^{(2)} + 3\lambda_j^{(3)} + 4\lambda_j^{(4)} \neq 0$ for some $j$ then, since
$\mathrm{rk}(Q) \geq r$, this cannot happen for more than $5^{-r}$ of all $d$, and (4.6) is bounded by $5^{-r}$.
If on the other hand

\[ \lambda_j^{(1)} + 2\lambda_j^{(2)} + 3\lambda_j^{(3)} + 4\lambda_j^{(4)} = 0 \tag{4.7} \]

for each $j$, then (4.6) further reduces to

\[ \mathbb{E}_{x,d}\, \omega^{u^T x + v^T d - w}, \]

which clearly vanishes unless

\[ \mu_j^{(1)} + \mu_j^{(2)} + \mu_j^{(3)} + \mu_j^{(4)} = \mu_j^{(1)} + 2\mu_j^{(2)} + 3\mu_j^{(3)} + 4\mu_j^{(4)} = 0 \tag{4.8} \]

for each $j$.

We have shown that the inner sum in (4.4) is $O(5^{-r/2})$ unless the five linear conditions
(4.5), (4.7), (4.8) are satisfied. The total contribution to (4.4) from cases where one of
these five conditions is not satisfied is therefore $O(5^{-r/2})$. The total contribution from
cases when the five conditions are satisfied is

\[ 5^{-4d_1-4d_2} \sum_{\substack{\mu,\lambda \\ \text{satisfying (4.5), (4.7), (4.8)}}} \omega^{-w(\mu,\lambda)}. \]

Since the $a^{(i)}$ are in arithmetic progression and the $b^{(i)}$ satisfy $b^{(1)} - 3b^{(2)} + 3b^{(3)} - b^{(4)} = 0$, it
is easy to check that $w(\mu,\lambda) = 0$ when the five conditions are satisfied. It remains only
to note that, of the $5^{4d_1+4d_2}$ choices for $\mu, \lambda$, the five conditions are satisfied for $5^{2d_1+d_2}$
of them. $\square$
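The final count can be confirmed by brute force in the smallest case. The sketch below assumes $d_1 = d_2 = 1$, so that $\mu$ and $\lambda$ are just vectors in $\mathbb{F}_5^4$; the names `lambda_solutions` and `mu_solutions` are hypothetical. It enumerates the solutions of the three conditions on $\lambda$ and the two conditions on $\mu$.

```python
from itertools import product

P = 5
IDX = (1, 2, 3, 4)  # the points x + i*d, i = 1, 2, 3, 4


def moment(vec, k):
    """The sum over i of i^k * vec_i, reduced mod 5."""
    return sum(i ** k * v for i, v in zip(IDX, vec)) % P


def lambda_solutions():
    """Vectors in F_5^4 satisfying the analogues of (4.5) and (4.7)."""
    return [lam for lam in product(range(P), repeat=4)
            if all(moment(lam, k) == 0 for k in (0, 1, 2))]


def mu_solutions():
    """Vectors in F_5^4 satisfying the analogue of (4.8)."""
    return [mu for mu in product(range(P), repeat=4)
            if all(moment(mu, k) == 0 for k in (0, 1))]
```

Here `lambda_solutions()` has $5 = 5^{d_2}$ elements, all multiples of $(1,-3,3,-1)$, and `mu_solutions()` has $25 = 5^{2d_1}$ elements, in agreement with the count $5^{2d_1+d_2}$. That every admissible $\lambda$ is a multiple of $(1,-3,3,-1)$ is exactly why (4.3) forces the $\lambda$-part of $w$ to vanish.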

If $f : \mathbb{F}_5^n \to \mathbb{C}$ is a $\mathcal{B}_2$-measurable function then we write $\mathbf{f} : \mathbb{F}_5^{d_1} \times \mathbb{F}_5^{d_2} \to \mathbb{C}$ for the
function which satisfies

\[ f(x) = \mathbf{f}(\Gamma(x), \Phi(x)) \]

for all $x \in \mathbb{F}_5^n$. We will adopt this convention of using bold letters to denote functions
on configuration space for the rest of these lectures without further comment.



We are now in a position to prove Theorem 4.1.

Proof of Theorem 4.1. Recall that $A \subseteq \mathbb{F}_5^n$ is a set with density $\alpha$. Apply Proposition
3.12 to find a quadratic factor $(\mathcal{B}_1,\mathcal{B}_2)$ with complexity $(d_1,d_2)$, $d_1, d_2 \leq d_0(\alpha,\epsilon)$ and rank
$r$ satisfying (say)

\[ r \geq 100(\log(1/\epsilon) + \log(1/\alpha) + d_1 + d_2) \]

together with a decomposition $1_A = f_1 + f_2 + f_3$ such that $f_1 = \mathbb{E}(1_A | \mathcal{B}_2)$, $\|f_2\|_2 \leq \delta$ and
$\|f_3\|_{U^3} \leq 1/\omega(d_1 + d_2)$. The parameter $\delta$ and the growth function $\omega$ will be specified as
the proof unfolds, but will depend only on $\alpha$ and $\epsilon$.

Let $r_1^T x, \dots, r_{d_1}^T x$ be the linear functions involved in $\mathcal{B}_1$, and let $H := \langle r_1, \dots, r_{d_1} \rangle^{\perp}$. Let
$1_H$ be the characteristic function of $H$, and let $\mu_H$ be the normalised measure on $H$,
thus $\mu_H := 1_H / \mathbb{E} 1_H$. We are going to prove that

\[ \mathbb{E}_{x,d}\, 1_A(x) 1_A(x+d) 1_A(x+2d) 1_A(x+3d) \mu_H(d) \geq \alpha^4 - \epsilon, \tag{4.9} \]

which clearly implies the theorem (for some $d \in H$). To do this, we split the left-hand
side of (4.9) into 81 parts by substituting $1_A = f_1 + f_2 + f_3$.

Claim 1. The contribution from any of the 65 terms which contain $f_2$ is no more than $\epsilon/200$.

Proof. Suppose that the term is

\[ \mathbb{E}_{x,d}\, g_1(x) g_2(x+d) g_3(x+2d) g_4(x+3d) \mu_H(d), \tag{4.10} \]

where $g_1 = f_2$ (the proofs of the other cases are very similar). Set $F(x) := \mathbb{E}_d\, g_2(x+d) g_3(x+2d) g_4(x+3d) \mu_H(d)$,
and observe that $\|F\|_\infty \leq 1$. It follows that

\[ |\mathbb{E}_{x,d}\, g_1(x) g_2(x+d) g_3(x+2d) g_4(x+3d) \mu_H(d)| \leq |\mathbb{E}_x\, g_1(x) F(x)| \leq \|f_2\|_1 \leq \|f_2\|_2. \]

This proves the claim provided that $\delta \leq \epsilon/200$.

Claim 2. The contribution from any of the 65 terms which contain $f_3$ is no more than
$\epsilon/200$.

Proof. Suppose that the term is

\[ \mathbb{E}_{x,d}\, g_1(x) g_2(x+d) g_3(x+2d) g_4(x+3d) \mu_H(d), \tag{4.11} \]

where $g_1 = f_3$ (the proofs of the other cases are very similar). We have

\[ 1_H(d) = \sum_t 1_{t+H}(x+2d) 1_{t+H}(x+d), \]

where the sum is over all cosets $t + H$ of $H$ in $\mathbb{F}_5^n$. By the generalised von Neumann
theorem (Proposition 1.11), we have

\[ |\mathbb{E}_{x,d}\, g_1(x) g_2(x+d) 1_{t+H}(x+d) g_3(x+2d) 1_{t+H}(x+2d) g_4(x+3d)| \leq \|f_3\|_{U^3} \leq 1/\omega(d_1 + d_2) \]

for each $t$. Since there are at most $5^{d_1}$ cosets and $\mu_H \leq 5^{d_1} 1_H$, it follows that (4.11) is no
more than $5^{2d_1}/\omega(d_1 + d_2)$, which proves the claim provided that $\omega(t) \geq 200 \cdot 5^{2t}/\epsilon$.
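The coset decomposition of $1_H(d)$ used in this proof is easy to test directly. The following sketch assumes a small ambient space $\mathbb{F}_5^n$ and takes $H$ to be the orthogonal complement of a given list of vectors; the helper names `dot` and `coset_identity_holds` are hypothetical.

```python
from itertools import product

P = 5  # we work in F_5^n


def dot(r, x):
    """The value r^T x in F_5."""
    return sum(ri * xi for ri, xi in zip(r, x)) % P


def coset_identity_holds(rs, n):
    """Check 1_H(d) = sum_t 1_{t+H}(x+d) 1_{t+H}(x+2d) for all x, d in F_5^n,
    where H is the orthogonal complement of the vectors in rs."""
    def label(x):
        # the coset of H containing x is determined by the values r^T x
        return tuple(dot(r, x) for r in rs)

    labels = list(product(range(P), repeat=len(rs)))
    for x in product(range(P), repeat=n):
        for d in product(range(P), repeat=n):
            x1 = tuple((xi + di) % P for xi, di in zip(x, d))
            x2 = tuple((xi + 2 * di) % P for xi, di in zip(x, d))
            lhs = int(all(v == 0 for v in label(d)))
            rhs = sum(int(label(x1) == t and label(x2) == t) for t in labels)
            if lhs != rhs:
                return False
    return True
```

For instance `coset_identity_holds([(1, 2)], 2)` checks the identity for a line $H$ in $\mathbb{F}_5^2$: the points $x+d$ and $x+2d$ lie in a common coset of $H$ exactly when $d \in H$.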

Remarks. Note carefully that for Claim 2 to follow we required the regularity parameter
$\omega(t)$ to be exponential in $t$, rather than (say) polynomial. This is why the full arithmetic
regularity lemma is required, rather than just the Koopman-von Neumann theorem.

These two claims account for 80 of the 81 terms into which we have decomposed the
left-hand side of (4.9). To finish the argument, it suffices to show that

\[ \mathbb{E}_{x,d}\, f_1(x) f_1(x+d) f_1(x+2d) f_1(x+3d) \mu_H(d) \geq \alpha^4 - \epsilon/2. \tag{4.12} \]

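Now $f_1$ is (by definition) constant on atoms of $\mathcal{B}_2$. Recall that these atoms are indexed
by the configuration space $\mathbb{F}_5^{d_1} \times \mathbb{F}_5^{d_2}$, and that we write $\mathbf{f}_1(a, b)$ for the value of $f_1$ on
the atom indexed by $(a, b)$.

Claim 3. We have

\[ \mathbb{E}_{(a,b) \in \mathbb{F}_5^{d_1} \times \mathbb{F}_5^{d_2}}\, \mathbf{f}_1(a,b) = \alpha\big(1 + O(5^{d_1+d_2-r/2})\big). \tag{4.13} \]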



Proof. Note that the result would be trivial (and would hold without the $O$-term) if
all the atoms of $\mathcal{B}_2$ had exactly the same size. Now recall that Lemma 4.2 gives an
approximate version of this statement. We leave the slightly tedious details to the
reader.

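Claim 4. We have

\[ \mathbb{E}_{x,d}\, f_1(x) f_1(x+d) f_1(x+2d) f_1(x+3d) \mu_H(d) = \big(1 + O(5^{2d_1+3d_2-r/2})\big)\, \mathbb{E}_{\substack{a \in \mathbb{F}_5^{d_1};\ b^{(1)},\dots,b^{(4)} \in \mathbb{F}_5^{d_2} \\ b^{(1)} - 3b^{(2)} + 3b^{(3)} - b^{(4)} = 0}}\, \mathbf{f}_1(a,b^{(1)}) \mathbf{f}_1(a,b^{(2)}) \mathbf{f}_1(a,b^{(3)}) \mathbf{f}_1(a,b^{(4)}). \]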

Proof. Condition on the quadruple $(a^{(1)}, b^{(1)}), \dots, (a^{(4)}, b^{(4)})$ of atoms containing $(x, x+d, x+2d, x+3d)$.
The constraint that $d \in H$ is equivalent to $a^{(1)} = a^{(2)} = a^{(3)} = a^{(4)} = a$,
say. By Lemma 4.3 we must also have $b^{(1)} - 3b^{(2)} + 3b^{(3)} - b^{(4)} = 0$. Invoking that same
lemma, we have

\[ \mathbb{E}_{x,d}\, f_1(x) f_1(x+d) f_1(x+2d) f_1(x+3d) 1_H(d) = \big(5^{-2d_1-3d_2} + O(5^{-r/2})\big) \sum_{a \in \mathbb{F}_5^{d_1}} \sum_{\substack{b^{(1)},\dots,b^{(4)} \in \mathbb{F}_5^{d_2} \\ b^{(1)} - 3b^{(2)} + 3b^{(3)} - b^{(4)} = 0}} \mathbf{f}_1(a,b^{(1)}) \mathbf{f}_1(a,b^{(2)}) \mathbf{f}_1(a,b^{(3)}) \mathbf{f}_1(a,b^{(4)}). \]

Normalising, we obtain the stated result.

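Now the rank $r$ was chosen very large ($r \geq 100(\log(1/\epsilon) + \log(1/\alpha) + d_1 + d_2)$). All we
need do to establish (4.12), then, is prove the inequality

\[ \mathbb{E}_{\substack{a \in \mathbb{F}_5^{d_1};\ b^{(1)},\dots,b^{(4)} \in \mathbb{F}_5^{d_2} \\ b^{(1)} - 3b^{(2)} + 3b^{(3)} - b^{(4)} = 0}}\, \mathbf{f}_1(a,b^{(1)}) \mathbf{f}_1(a,b^{(2)}) \mathbf{f}_1(a,b^{(3)}) \mathbf{f}_1(a,b^{(4)}) \geq \big( \mathbb{E}_{(a,b) \in \mathbb{F}_5^{d_1} \times \mathbb{F}_5^{d_2}}\, \mathbf{f}_1(a,b) \big)^4. \tag{4.14} \]

Noting that the left-hand side is

\[ \mathbb{E}_{a \in \mathbb{F}_5^{d_1}} \mathbb{E}_{z \in \mathbb{F}_5^{d_2}} \Big( \mathbb{E}_{\substack{b,b' \in \mathbb{F}_5^{d_2} \\ b - 3b' = z}}\, \mathbf{f}_1(a,b) \mathbf{f}_1(a,b') \Big)^2, \]

this follows from two applications of the Cauchy-Schwarz inequality.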

Alternatively, it is amusing to give an interpretation in terms of the Fourier transform.
The left-hand side of (4.14) is

\[ \mathbb{E}_{a \in \mathbb{F}_5^{d_1}} \sum_{r \in \mathbb{F}_5^{d_2}} |\tilde{\mathbf{f}}_1(a,r)|^2 |\tilde{\mathbf{f}}_1(a,-3r)|^2. \tag{4.15} \]

In this expression the tilde denotes Fourier transform in the second variable, which was
called $b$ in (4.14).

A lower bound for (4.15) comes from ignoring all terms except those with $r = 0$, yielding

\[ \mathbb{E}_{a \in \mathbb{F}_5^{d_1}} |\tilde{\mathbf{f}}_1(a,0)|^4 = \mathbb{E}_{a \in \mathbb{F}_5^{d_1}} |\mathbb{E}_{b \in \mathbb{F}_5^{d_2}}\, \mathbf{f}_1(a,b)|^4. \]

The result now follows from Hölder's inequality.
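Both the Fourier identity and the resulting inequality can be tested numerically in a toy case. The sketch below assumes $d_1 = 0$ and $d_2 = 1$, so that $\mathbf{f}_1$ is simply a real-valued function on $\mathbb{F}_5$ stored as a list of five numbers, and it takes the normalisation $\tilde{f}(r) = \mathbb{E}_b f(b) \omega^{-rb}$ with $\omega = e^{2\pi i/5}$ as an assumption. All function names are hypothetical.

```python
import cmath
import random
from itertools import product

P = 5


def constrained_average(f):
    """Average of f(b1) f(b2) f(b3) f(b4) over b1 - 3 b2 + 3 b3 - b4 = 0 in F_5."""
    total = 0.0
    for b1, b2, b3 in product(range(P), repeat=3):
        b4 = (b1 - 3 * b2 + 3 * b3) % P  # the constraint determines b4
        total += f[b1] * f[b2] * f[b3] * f[b4]
    return total / P ** 3


def fourier(f, r):
    """Normalised Fourier coefficient E_b f(b) omega^{-r b}."""
    return sum(f[b] * cmath.exp(-2j * cmath.pi * r * b / P) for b in range(P)) / P


def fourier_side(f):
    """The sum over r of |f~(r)|^2 |f~(-3r)|^2, as in (4.15)."""
    return sum(abs(fourier(f, r)) ** 2 * abs(fourier(f, (-3 * r) % P)) ** 2
               for r in range(P))


def check_inequality(trials=50, seed=0):
    """Compare both sides of the toy version of (4.14) on random f: F_5 -> [0,1]."""
    rng = random.Random(seed)
    for _ in range(trials):
        f = [rng.random() for _ in range(P)]
        lhs = constrained_average(f)
        mean = sum(f) / P
        if abs(lhs - fourier_side(f)) > 1e-9 or lhs < mean ** 4 - 1e-12:
            return False
    return True
```

In this toy case the $r = 0$ term of `fourier_side` is exactly the fourth power of the mean of `f`, which is the lower bound used in the text.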

A more interesting application of these partial Fourier transforms may be found in [19]. 

5. Lecture 5 

Topics to be covered 

• An introduction to the theory on $\mathbb{Z}/N\mathbb{Z}$.

For simplicity I will assume that $N$ is a large prime.

I am only scheduled to give four lectures at the school. These notes are here for two 
reasons: firstly, it is possible that I will finish the material from the first four lectures 
early. More importantly, it is the theory on the group $\mathbb{Z}/N\mathbb{Z}$ that is of most interest for
applications in number theory, and it would be remiss of me to not at least point the 
reader in directions where she may learn more. 

Note that the theory on $\mathbb{Z}/N\mathbb{Z}$ is actually rather richer than for an arbitrary abelian
group $G$, because we have been able to pursue analogies with ergodic theory. This
is concerned with $\mathbb{Z}$-actions, and $\mathbb{Z}/N\mathbb{Z}$ is the finite abelian group which most closely
models $\mathbb{Z}$.

One way of motivating the theory is to try and take what we know for $\mathbb{F}_5^n$ and attempt
to adapt it to $\mathbb{Z}/N\mathbb{Z}$. Let us note that the basic definitions of Gowers norms and the
basic generalised von Neumann theorems of Lecture 1 go over essentially unchanged to
$\mathbb{Z}/N\mathbb{Z}$. The first stumbling block comes at the point where we ask for a conjectural
analogue of Proposition 2.2. A first guess might be:

Conjecture 5.1. Suppose that $f : \mathbb{Z}/N\mathbb{Z} \to [-1, 1]$ is a function with $\|f\|_{U^3} \geq \delta$. Then
there are $r, s \in \mathbb{Z}/N\mathbb{Z}$ such that

\[ \Big| \mathbb{E}_{x \in \mathbb{Z}/N\mathbb{Z}}\, f(x) e\Big( \frac{rx^2 + sx}{N} \Big) \Big| \gg_\delta 1. \]

Remark. As usual in analytic number theory we have written $e(\theta) := e^{2\pi i \theta}$.

It turns out that this conjecture is false. One example of a function on $\mathbb{Z}/N\mathbb{Z}$ which
has large $U^3$-norm, but does not correlate with a quadratic form $e((rx^2 + sx)/N)$, is a
quadratic $e(\theta x^2)$ where $\theta \neq r/N$. Such a quadratic is most naturally defined on $\mathbb{Z}$, but
by restricting its domain to $\{1, \dots, N\}$ one obtains a function which can be defined
on $\mathbb{Z}/N\mathbb{Z}$. Another example is a "bracket quadratic" such as $e(\theta_1 x \{\theta_2 x\})$, where $\{t\}$
denotes the fractional part of $t$. The second of these counterexamples is somehow more
serious, but it is also rather harder to see that this rather exotic function does provide
a counterexample to Conjecture 5.1. For a brief discussion see [IHl §6], and for more
detail see [18].



If the only obvious generalisation of Proposition 2.2 is wrong, how should we proceed?
It turns out that a hint is given to us by the quantitatively stronger form of the inverse
theorem for the $U^3$-norm on $\mathbb{F}_5^n$, namely Proposition 2.9. We are not concerned with
quantitative issues here, so let us state a weak consequence of that result. This is
actually a trivial consequence of Proposition 2.2 too.

Proposition 5.2 (Inverse result for $U^3$-norm on $\mathbb{F}_5^n$, III). Suppose that $f : \mathbb{F}_5^n \to [-1, 1]$
is a function with $\|f\|_{U^3} \geq \delta$. Then there is a subspace $H \leq \mathbb{F}_5^n$ with $\mathrm{codim}\, H \ll_\delta 1$, a
matrix $M \in \mathfrak{M}_n(\mathbb{F}_5)$ and a vector $r \in \mathbb{F}_5^n$ such that

Remark. It is not too hard to show that this is equivalent to Proposition 2.2; we leave
this as an exercise to the reader.

Let us try and generalise this result. There are two objects which do not obviously 
transfer to $\mathbb{Z}/N\mathbb{Z}$: the notion of subspace, and (implicitly) the notion of quadratic form.
It turns out that the second notion can be sensibly formulated for functions defined on 
any set. 

Definition 5.3 (Quadratic forms). Let $S$ be a set in some abelian group, and let
$\psi : S \to \mathbb{R}/\mathbb{Z}$ be a function. We say that $\psi$ is a quadratic form if the second derivative

\[ \psi''(h_1, h_2) := \psi(x + h_1 + h_2) - \psi(x + h_1) - \psi(x + h_2) + \psi(x) \]

is well-defined, that is to say if this definition does not depend on $x$ whenever
$x, x + h_1, x + h_2, x + h_1 + h_2 \in S$.
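Definition 5.3 can be tested by brute force on a small set of integers. The following is a minimal sketch, assuming that $S$ is a finite set of integers and that $\psi$ takes rational values, which are reduced modulo 1 with exact arithmetic; the names `second_derivative` and `is_quadratic_form` are hypothetical.

```python
from fractions import Fraction


def second_derivative(psi, x, h1, h2):
    """psi(x+h1+h2) - psi(x+h1) - psi(x+h2) + psi(x), as an element of R/Z."""
    return (psi(x + h1 + h2) - psi(x + h1) - psi(x + h2) + psi(x)) % 1


def is_quadratic_form(psi, S):
    """Brute-force test of Definition 5.3 on a finite set S of integers."""
    S = set(S)
    seen = {}  # maps (h1, h2) to the value found at the first admissible x
    for x in S:
        for y in S:
            for z in S:
                h1, h2 = y - x, z - x
                if x + h1 + h2 not in S:
                    continue
                val = second_derivative(psi, x, h1, h2)
                if seen.setdefault((h1, h2), val) != val:
                    return False  # the second derivative depends on x
    return True
```

For example, with $\theta = 2/7$ the function $\psi(x) = \theta x^2 + x/3$ passes the test on $S = \{-5, \dots, 5\}$, its second derivative being $2\theta h_1 h_2$, whereas the cubic $x^3/7$ fails it.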

Whilst the notion of subspace is rather vacuous in Z/NZ, there is a plentiful supply of 
approximate subspaces. These are more usually called Bohr sets. 

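Definition 5.4 (Approximate subspaces/Bohr sets). Let $R = \{r_1, \dots, r_k\} \subseteq \mathbb{Z}/N\mathbb{Z}$
and let $\epsilon > 0$. Then we write

\[ B(R,\epsilon) := \{x \in \mathbb{Z}/N\mathbb{Z} : |e(rx/N) - 1| \leq \epsilon \text{ for all } r \in R\}. \]

This is called the Bohr set with width $\epsilon$ corresponding to frequency set $R$.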
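For small $N$ a Bohr set can simply be listed. The sketch below follows Definition 5.4 literally, representing $\mathbb{Z}/N\mathbb{Z}$ by the integers $0, \dots, N-1$; the function names `e` and `bohr_set` are chosen here for illustration only.

```python
import cmath


def e(theta):
    """The additive character e(theta) = exp(2 pi i theta)."""
    return cmath.exp(2j * cmath.pi * theta)


def bohr_set(R, eps, N):
    """List the elements x of Z/NZ with |e(r x / N) - 1| <= eps for every r in R."""
    return [x for x in range(N) if all(abs(e(r * x / N) - 1) <= eps for r in R)]
```

With $N = 101$, $R = \{1\}$ and $\epsilon = 1/2$, the call `bohr_set([1], 0.5, 101)` returns the interval $\{-8, \dots, 8\}$ modulo 101. Adding further frequencies to $R$ can only shrink the set, and $0$ always belongs to it.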

The set $R$ should actually be thought of as a set of characters on $\mathbb{Z}/N\mathbb{Z}$, each value $r$
corresponding to the character $x \mapsto e(rx/N)$. Once thought of in this way, it is easy to
see how Bohr sets can be defined on any finite abelian group $G$. Bohr sets on $\mathbb{F}_5^n$ do not
depend very seriously on the width parameter $\epsilon$, and certainly for $\epsilon < 1/10$ (say) they
are just vector subspaces.

There is a lot to say about Bohr sets, and much information may be found in [M]. See
also [T^, where there is a discussion of the place of Bohr sets in the transition from
finite field models to $\mathbb{Z}/N\mathbb{Z}$ in various settings. We caution the reader that there are
certain technicalities associated with the study of Bohr sets in additive combinatorics,
most particularly the need to consider regular Bohr sets (ones that "behave well at the
edges"). In this brief overview we will say nothing more about these technicalities, other
than that most of them were overcome in a seminal paper of Bourgain [1].

To return to the point, we may now state Theorem 2.7 (i) of [18], which is an inverse
theorem for the $U^3$-norm on $\mathbb{Z}/N\mathbb{Z}$. In the light of the above discussion, the reader will
see that it is a natural generalisation of Proposition 5.2.

Proposition 5.5 (Inverse theorem for the $U^3$-norm on $\mathbb{Z}/N\mathbb{Z}$, I). Suppose that
$f : \mathbb{Z}/N\mathbb{Z} \to [-1, 1]$ is a function and that $\|f\|_{U^3} \geq \delta$. Then there is a set $R \subseteq \mathbb{Z}/N\mathbb{Z}$,
$|R| \ll_\delta 1$, a parameter $\epsilon \gg_\delta 1$ such that the Bohr set $B := B(R,\epsilon)$ is regular, some
$y \in \mathbb{Z}/N\mathbb{Z}$ and a quadratic form $\psi : y + B \to \mathbb{R}/\mathbb{Z}$ such that

\[ |\mathbb{E}_x\, f(x) 1_{y+B}(x) e(\psi(x))| \gg_\delta 1. \tag{5.1} \]

It turns out that this result is necessary and sufficient, that is to say if (5.1) is satisfied then
$\|f\|_{U^3}$ is large. See [18], Thm 2.7 (ii) (note that this is the only point at which the
regularity of $B(R,\epsilon)$ is relevant). This is, at first sight, a very unsatisfactory state of
affairs: we have a theorem which gives a necessary and sufficient condition for a natural
problem which interests us, yet the theorem is somewhat inelegant and difficult to state.

Our subject being in some sense an extension of the work of Hardy and Littlewood, one 
should perhaps recall at this point Hardy's view that there is "no permanent place in 
the world for ugly mathematics" . 

With this in mind we observe that although Proposition 5.5 is necessary and sufficient,
it need not be the only necessary and sufficient condition. In what follows we will
be rather vague. Write $\mathcal{Q} = \mathcal{Q}(\delta)$ for the collection of all "quadratic obstructions" of
the form $1_{y+B}(x) e(\psi(x))$, where $B, \psi$ are as above. Any other collection $\mathcal{Q}'$ with the
property that anything in $\mathcal{Q}$ is approximately a linear combination of elements in $\mathcal{Q}'$,
and vice versa, will also be a necessary and sufficient collection of quadratic obstructions
for $\mathbb{Z}/N\mathbb{Z}$.

It turns out that there is a very natural choice for $\mathcal{Q}'$, the collection of 2-step nilsequences.
The idea that we should look at these objects came to us from ergodic theory - there
will be much more on this in the lectures of Bryna Kra at the school.

Let $G$ be a connected, simply-connected 2-step nilpotent Lie group over $\mathbb{R}$ and let $\Gamma \leq G$
be a discrete, cocompact subgroup. The quotient $G/\Gamma$ is called a 2-step nilmanifold.
For the sake of illustration, we recommend that the reader take

\[ G := \begin{pmatrix} 1 & \mathbb{R} & \mathbb{R} \\ 0 & 1 & \mathbb{R} \\ 0 & 0 & 1 \end{pmatrix}, \qquad \Gamma := \begin{pmatrix} 1 & \mathbb{Z} & \mathbb{Z} \\ 0 & 1 & \mathbb{Z} \\ 0 & 0 & 1 \end{pmatrix}, \]

in which case $G/\Gamma$ is a 3-dimensional compact manifold called the Heisenberg nilmanifold.

Let $g \in G$ and $x \in G/\Gamma$ be arbitrary. The element $g$ induces a continuous map
$T_g : G/\Gamma \to G/\Gamma$ by multiplication on the left. Any sequence of the form $(F(T_g^n \cdot x))_{n \in \mathbb{N}}$,
where $F : G/\Gamma \to [-1,1]$ is continuous, is called a 2-step nilsequence. It turns out that
the collection of 2-step nilsequences can play the role of $\mathcal{Q}'$ as discussed above. The
following is proved in [18], Thm. 12.8.
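To get a feel for the orbit $n \mapsto T_g^n \cdot x$ on the Heisenberg nilmanifold, one can compute powers of $g$ exactly and reduce them to a fundamental domain. The sketch below is an illustration under stated assumptions rather than a calculation taken from the references: it takes $x$ to be the identity coset, uses rational entries so that all arithmetic is exact, and identifies $G/\Gamma$ with $[0,1)^3$ by multiplying on the right by a suitable element of $\Gamma$. The helper names `heis`, `matmul`, `power` and `reduce_mod_gamma` are hypothetical.

```python
from fractions import Fraction
from math import floor


def heis(a, b, c):
    """The upper unitriangular matrix with entries a, b above the diagonal and c in the corner."""
    return ((1, a, c), (0, 1, b), (0, 0, 1))


def matmul(A, B):
    """Product of two 3x3 matrices given as nested tuples."""
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3))
                 for i in range(3))


def power(g, n):
    """The n-th power of g, for an integer n >= 0, by repeated multiplication."""
    result = heis(0, 0, 0)  # the identity matrix
    for _ in range(n):
        result = matmul(result, g)
    return result


def reduce_mod_gamma(g):
    """Coordinates in [0,1)^3 of the coset g * Gamma."""
    a, b, c = g[0][1], g[1][2], g[0][2]
    corner = c - a * floor(b)  # effect of right-multiplying by heis(-floor(a), -floor(b), 0)
    return (a - floor(a), b - floor(b), corner - floor(corner))
```

If $g$ has entries $a, b, c$ then $g^n$ has entries $na$, $nb$ and $nc + \binom{n}{2} ab$, so the third coordinate returned by `reduce_mod_gamma(power(g, n))` is the fractional part of $nc + \binom{n}{2} ab - na \lfloor nb \rfloor$. The term $na \lfloor nb \rfloor$ is where expressions resembling the bracket quadratics mentioned earlier enter.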

Proposition 5.6 (Inverse Theorem for the $U^3$-norm on $\mathbb{Z}/N\mathbb{Z}$, II). Let $f : \mathbb{Z}/N\mathbb{Z} \to [-1,1]$
be a function, and suppose that $\|f\|_{U^3} \geq \delta$. Then there is a 2-step nilsequence
$(F(T_g^n \cdot x))_{n \in \mathbb{N}}$ with complexity $\ll_\delta 1$ such that

\[ |\mathbb{E}_{n \in \mathbb{Z}/N\mathbb{Z}}\, f(n) F(T_g^n \cdot x)| \gg_\delta 1. \]

If, conversely, $f$ correlates with a 2-step nilsequence of bounded complexity then the
$\|\cdot\|_{U^3}$-norm of $f$ is large.

We have not defined the complexity of a nilsequence. It is some number associated to
$(F(T_g^n \cdot x))_{n \in \mathbb{N}}$ which bounds both the dimension of the underlying nilmanifold $G/\Gamma$,
and also the Lipschitz constant of $F$ with respect to some sensible metric. There is no
canonical way of defining the complexity, but this is not important for the theory.

We do not attempt to explain why this collection $\mathcal{Q}'$ of 2-step nilsequences is "equivalent"
to the collection $\mathcal{Q}$ used in Proposition 5.5. Detailed technical discussions may be
found in [TSII^. A short calculation involving the Heisenberg example, showing how a
2-step nilsequence on it resembles a quadratic form on a Bohr set, is given in [15].